| | CORLEC | C-ORAL-ROM | CHIEDE | ARABIC-SPANISH CORPUS | MAVIR CORPUS | C-ORAL-CHINA | C-ORAL-JAPÓN |
|---|---|---|---|---|---|---|---|
| Compilation date | 1990-92 | 2001-04 | 2008 | 2005 | 2006-08 | 2010-11 | 2010-11 |
| Type of corpus | Oral | Oral | Oral | Written | Oral | Oral | Oral |
| Languages | Spanish | Spanish, Portuguese, Italian, French | Spanish | Spanish, Arabic, English | Spanish, English | Chinese | Japanese |
| Number of words | 1,100,000 | 312,000 for each language | 60,000 | 4,000 for each language | 103,000 | 140,000 characters | 235,000 characters |
| Type of recording | Analog | Digital | Digital | | Digital | Digital | Digital |
| Annotation levels | Features of speech | Prosody and morphology; partial semantics and pragmatics | Prosody, morphology and phonology | Structure (paragraphs, sentences and tokens), categories and partial pragmatics | Prosody | Prosody | Prosody |
| Text-sound alignment | No | Yes | Yes | | Yes | Yes | Yes |
| Participants' permission | No | Yes | Yes | Not necessary | Yes | Yes | Yes |
| Validation | No | Yes, internal and external | Yes, internal | | Yes, internal | Yes, internal | Yes, internal |
| Search engine | No | Yes | Yes | No | No | Yes | Yes |
| User guide | No | Yes | Yes | No | Yes | Yes | Yes |
| Phonological transcription | No | No | Yes | No | No | Pinyin | No |