
Laboratorio de Lingüística Informática



Research



My main line of research is the development of computational grammars and dictionaries for Spanish. As a grammar writer, I have participated in several projects: Eurotra, LanguageAccess and PROTEUS. Since December 1997, I have been the Principal Investigator of a project to develop a syntactically annotated corpus of Spanish (UAM Spanish Treebank), partially funded by New York University.

At the theoretical level, I have been working on unification formalisms since 1988. My most important work in this field is my dissertation: Un modelo computacional basado en la unificación para el análisis y generación de la morfología del español. It led to a collaboration with the ARIES group at the E.T.S.I. de Telecomunicación of the Universidad Politécnica de Madrid, which produced several lexical resources for Spanish. From 2000 to 2002, we carried out a coordinated project funded by the CICYT: ACORDEON (Aplicaciones Cooperativas de Recuperación de Información). At the end of 2001, my book Gramáticas de unificación y rasgos was published by Visor/Antonio Machado Libros.

My other theoretical lines of research are morphology (especially Spanish morphosyntax and its computational processing) and linguistic methodology (symbolic and statistical models, exception handling, theory evaluation).

As a result of a joint action (1994-1995) with the University of Augsburg (Germany) on reusing dictionaries of Americanisms to create a lexical database of American Spanish, I became interested in the encoding of texts in digital format, especially lexicographical ones. I have worked on developing criteria for evaluating printed dictionaries, and I have taught several courses on computational terminology and lexicography.

Information retrieval is another area of research: from January 2000 to December 2002 I was the Principal Investigator of the ACORDEON project.

The development of linguistic resources is also one of my main lines of research. Recently, I was the Principal Investigator of the Spanish team in the European project C-ORAL-ROM (the local page of the project is http://www.lllf.uam.es/c-oral-rom/index.html). The project officially finished in May 2004, but we are still working on improving the resources.

The corpus is available in two formats:

In January 2005 we started a new project on multilingual information retrieval, RILARIM (Recursos de Ingeniería Lingüística Aplicados a la Recuperación de Información Multilingüe), a subproject of the coordinated project RIMMEL (funded by the Ministry of Education and Science, TIN2004-07588-C03-02; 13 December 2004 to 12 December 2007; Principal Investigator: Antonio Moreno Sandoval).

In January 2006, together with five other groups from the Comunidad de Madrid, we began the MAVIR programme (Mejorando el acceso y la visibilidad de la información multilingüe en red). I am the head of the PLN@UAM group, in which Enrique Alfonseca also participates as a distinguished member.

Since the end of 2007, we have been involved in a new project with our partners from UC3M and UPM: BRAVO (Búsqueda de Respuestas Avanzada Multimodal y Multilingüe, funded by the Ministry of Education and Science). Our main task is the development of linguistic resources in Spanish, Arabic and Japanese.



