Laboratorio de Lingüística Informática

History

The Computational Linguistics Laboratory ('Laboratorio de Lingüística Informática', LLI) is a research group recognized and supported by the Autonomous University of Madrid (Universidad Autónoma de Madrid, UAM). The background and history of the laboratory, and its contributions over the last 25 years, are described in the document linked below (from the journal Maestros de la filología, 2014).

Click here to download the document with the history of the LLI-UAM group (in Spanish, PDF format).

The Laboratorio de Lingüística Informática was created at the UAM-IBM Research Centre after Francisco Marcos Marín joined as Professor of General Linguistics in 1981.

In the 1980s, the work had a twofold goal. On the one hand, the group collaborated with IBM on immediate projects such as spelling checkers, vocabularies, and tool development for the new personal computers. On the other hand, work began on the application of computers to philology, especially unified and critical editions. This second line of work later led to programs for electronic critical editions, such as UNITE, and to more extensive projects such as ADMYTE (Digital Archive of Manuscripts and Electronic Texts).

The work started at the UAM-IBM Scientific Centre spread to the equivalent IBM centre in Heidelberg, thanks to a grant awarded by the Alexander von Humboldt Stiftung to Francisco Marcos Marín. Between 1985 and 1987, the first major application of the computer programs to text editing was carried out on the Libro de Alexandre. The work done between Madrid and Germany led the group into contact with other European groups that were starting out in linguistic and computational activities, particularly the group that began with the EUROTRA machine translation project, supported at the time by the European Commission.

Although the Laboratory was formed at the UAM-IBM Scientific Centre, it was consolidated through the EUROTRA project. Researchers who had worked at the Centre, such as Antonio Moreno Sandoval, were joined at the Laboratory by others, such as Fernando Sánchez León and Flora Ramírez Bustamante, whose role has been crucial to the work carried out since then.

In the early 1990s, the work on EUROTRA was combined with work on the digital archives supported by the Sociedad Estatal del Quinto Centenario. This is the reason the Laboratory's work and projects divided into two orientations: the philological-textual one and corpus linguistics. There are many links between the two, and the group also looks ahead to new possibilities. That is why the Laboratory is a centre in constant motion, always open to collaborations and consortia. The LLI-UAM group occupies its own place in the interdisciplinary research area between computers and language, both in Spain and in the Spanish-speaking community.

Since 2000, the LLI-UAM group has specialized in compiling corpora: parallel corpora (Arabic-Spanish-English), spontaneous speech corpora (C-ORAL-ROM), child speech corpora (CHIEDE), multimodal corpora (MAVIR), foreign/second language oral corpora (Spanish Learner Oral Corpus and French Learner Oral Corpus) and specialized language corpora (MultiMedica). The group has also created several linguistic resources: acoustic databases, applications of corpora for foreign/second language teaching and learning (Textos de español oral, UAM Publishing Services, 2010), electronic dictionaries (Japanese-English-Spanish, and French prepositions), and a morphological analyzer of Arabic verbs (JABALÍN).

The LLI-UAM group collaborates closely with several researchers and professors from the UAM Departments of Computer Science Engineering and Telecommunications Engineering. Since December 2009, the group has also collaborated with the Instituto de Ingeniería del Conocimiento, a private, non-profit research and development institution on the UAM Cantoblanco campus.