Laboratorio de Lingüística Informática

C-ORAL-JAPÓN

At LLI, work on the Japanese language focuses on two linguistic resources: a Japanese corpus and a dictionary of basic vocabulary. Both resources were developed as part of the research of Professor Chieko Kimura. Other researchers, including Shin Abe, Kengo Matsui and Marta Garrote, also took part in their development.


C-ORAL-JAPÓN CORPUS WEB PAGE

Spoken Japanese Corpus

The spoken Japanese corpus is the result of the research of Professor Chieko Kimura. It comprises more than 12 hours of recordings, divided into three groups according to the kind of interaction: monologues, dialogues and conversations. The essential data of the corpus are shown below:

C-ORAL-JAPÓN
Type          File    Characters  Duration  Location
Conversation  jcv01   2,690       0:13:01   Tokyo
Conversation  jcv02   1,505       0:07:49   Tokyo
Conversation  jcv03   1,913       0:09:49   Tokyo
Conversation  jcv04   2,311       0:10:51   Tokyo
Conversation  jcv05   1,856       0:08:48   Tokyo
Conversation  jcv06   4,120       0:18:21   Tokyo
Dialogue      jdl01   5,419       0:30:37   Madrid
Dialogue      jdl02   7,520       0:37:59   Madrid
Dialogue      jdl03   2,977       0:13:56   Madrid
Dialogue      jdl04   1,234       0:06:12   Tokyo
Dialogue      jdl05   2,615       0:10:38   Madrid
Dialogue      jdl06   2,297       0:09:53   Madrid
Dialogue      jdl07   2,976       0:18:08   Madrid
Dialogue      jdl08   3,901       0:22:36   Madrid
Dialogue      jdl09   3,012       0:14:43   Madrid
Dialogue      jdl010  3,328       0:17:07   Madrid
Dialogue      jdl011  1,462       0:07:09   Madrid
Dialogue      jdl012  1,452       0:06:23   Madrid
Dialogue      jdl013  3,112       0:16:35   Madrid
Dialogue      jdl014  2,905       0:12:42   Madrid
Dialogue      jdl015  2,648       0:15:27   Tokyo
Dialogue      jdl016  2,750       0:13:16   Tokyo
Dialogue      jdl017  1,405       0:07:08   Tokyo
Monologue     jmn01   2,041       0:11:18   Madrid
Monologue     jmn02   2,887       0:16:20   Tokyo
Monologue     jmn03   1,482       0:10:17   Tokyo
Monologue     jmn04   7,552       0:38:47   Tokyo
Monologue     jmn05   2,867       0:22:45   Tokyo
Monologue     jmn06   7,683       0:52:31   Tokyo
Monologue     jmn07   3,170       1:05:53   Tokyo
Monologue     jmn08   2,962       0:19:37   Tokyo
Monologue     jmn09   9,898       0:59:43   Shizuoka
Monologue     jmn010  979         0:08:08   Shizuoka
Monologue     jmn011  948         0:05:24   Shizuoka
Monologue     jmn012  1,171       0:23:28   Shizuoka
Monologue     jmn013  1,409       0:09:23   Shizuoka
Monologue     jmn014  644         0:04:19   Shizuoka
Monologue     jmn015  10,016      0:50:50   Madrid
Monologue     jmn016  4,177       0:27:09   Madrid
Total                 125,294     12:35:00

Currently, the work is focused on the following aims:





