Natural Language Processing for MEDical TERMinology

Project funded by InterTalentum UAM, Marie Skłodowska-Curie COFUND (2019-2021), at the Autonomous University of Madrid (Universidad Autónoma de Madrid)


The NLPMedTerm project aims to provide the research community with Spanish-language resources for natural language processing (NLP) in the health domain.

Work Package 1: an enriched lexicon of medical terms. It includes Concept Unique Identifiers (CUI) from the Unified Medical Language System© (UMLS©). → Deliverable 1
Linguistic information about the terms is provided, and the Part-of-Speech (PoS) category is currently being added. Compositional and derivational data on medical terms are also provided, together with the equivalences between synonymous roots and affixes (e.g. cardio- / cardiac-).
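
As a rough illustration of the kind of entry described above, the following Python sketch pairs a term with a CUI, an optional PoS tag and its roots, and checks whether two roots are recorded as equivalent. It does not reflect the actual format of the lexicon: the class `LexiconEntry`, the function `are_equivalent`, the field names and the sample term are hypothetical, and the CUI shown is a placeholder rather than a real UMLS identifier. Only the cardio- / cardiac- pair comes from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class LexiconEntry:
    """Hypothetical lexicon entry; field names are assumptions."""
    term: str
    cui: str                       # UMLS Concept Unique Identifier
    pos: Optional[str] = None      # PoS category, if already assigned
    roots: List[str] = field(default_factory=list)


# Groups of roots/affixes treated as synonymous (illustrative only).
EQUIVALENCE_CLASSES: List[Set[str]] = [
    {"cardio-", "cardiac-"},
]


def are_equivalent(a: str, b: str) -> bool:
    """Return True if both roots are identical or share an equivalence class."""
    if a == b:
        return True
    return any(a in group and b in group for group in EQUIVALENCE_CLASSES)


# Sample entry: the CUI is a placeholder, not a real identifier.
entry = LexiconEntry(term="cardiaco", cui="C0000000", pos="ADJ", roots=["cardiac-"])
```

Storing equivalences as sets keeps the relation symmetric, so looking up cardio- against cardiac- gives the same answer in either order.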

Work Package 2: a corpus of texts annotated with medical entities, as a resource for experiments in Named Entity Recognition. The corpus is intended for training machine-learning models that incorporate state-of-the-art neural network approaches. → Deliverable 2
In this Work Package, we have also created word embeddings from the medical domain. → Deliverable 3

Collaborators in Work Package 2:

The project favours continuity with future projects aimed at improving the indexing of online repositories of biomedical articles or at developing lexicographic resources that cover different varieties of Spanish.



Leonardo Campillos-Llanos, PhD, postdoctoral researcher.

Computational Linguistics Laboratory, Universidad Autónoma de Madrid

name.surname AT
name.surname AT

Interdisciplinary collaborations


Last update: January 2021.