Funded by UAM and Santander Bank
Period: 1 July 2013 to 31 December 2014

Principal Investigator in Spain: Antonio Moreno Sandoval (UAM)
Principal Investigators in Japan: Hiroto Ueda (University of Tokyo), Toshihiro Takagaki (Tokyo University of Foreign Studies), Antonio Ruiz Tinoco (Sophia University)

This project is a collaboration between the Laboratorio de Lingüística Informática (LLI) and three Japanese institutions: Tokyo University of Foreign Studies (TUFS), University of Tokyo, and Sophia University. The project has two goals: (1) to reuse the electronic linguistic resources developed by each team, and (2) to integrate corpora and electronic resources for linguistic analyses of the Spanish language.

  1. To gather and reuse the different corpora and taggers to carry out research on the Spanish language.
  2. To develop and improve electronic tools for language analysis (e.g. part-of-speech analyzers).
  3. To share didactic materials for Spanish.
  4. To publish the research results achieved jointly by the participating researchers.

These activities are expected to strengthen institutional relations between the participating universities, which are highly ranked both nationally and internationally. An international agreement and an exchange program for teachers and students already exist between TUFS and UAM.

  1. Design of resource integration: analysis of needs and costs for integrating the corpora into the electronic search interface.
  2. Adaptation of existing resources to the format needed for the electronic search interface.
  3. Integration of electronic linguistic resources: the GRAMPAL PoS analyzer (Laboratorio de Lingüística Informática) and the LETRAS tool (developed by Professor Hiroto Ueda, University of Tokyo).
  4. Indexing of the corpora in the database of the search interface.
  5. Development of the web interface, following the LLI's methodology.
  6. Linguistic analyses using the electronic resources (e.g. frequency lists or analysis of syntactic patterns).


