Main

Laboratorio de Lingüística Informática

FinT-Esp: Financial text analytics in Spanish: Tools and language resources.

Program MINECO 2017
Research project assigned to "Proyectos de I+D+I del Programa Estatal de Investigación, Desarrollo e Innovación Orientada a los Retos de la Sociedad."
From January 2018 to June 2021

The process of automatically analysing textual content is called Text Analytics. This process can be widely applied to different fields: from analyses of social network comments to information extraction from legal, medical or financial texts.

The main challenge of Text Analytics is to understand the content of linguistic utterances and to present the relevant information. To achieve these goals, different techniques are used, including statistical methods (data mining) and rule-based procedures.

Our approach is based on the traditional method of Computational Linguistics: we annotate the relevant information appearing in unstructured texts by means of domain-specific rules and lexicons. Then, we analyse this information quantitatively and qualitatively using corpus linguistics tools (Lyneal and Wmatrix).
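The annotation step described above can be illustrated with a minimal sketch. It assumes a toy lexicon, toy labels (GAIN, LOSS, AMOUNT) and a single regular-expression rule, all hypothetical and chosen only for illustration; they are not the project's actual lexicons, rules or the UCREL semantic tagset, and the function names are likewise invented here.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (illustrative only): term -> semantic label.
LEXICON = {
    "beneficio": "GAIN",
    "crecimiento": "GAIN",
    "pérdida": "LOSS",
    "deuda": "LOSS",
}

# Hypothetical domain rule: a number followed by a currency expression
# is annotated as a monetary amount.
AMOUNT_RULE = re.compile(r"\d+(?:[.,]\d+)*\s*(?:millones de euros|euros|€)")


def annotate(text):
    """Return (start, matched_text, label) tuples sorted by position."""
    annotations = []
    # Lexicon lookup on lowercased word tokens.
    for match in re.finditer(r"\w+", text.lower()):
        label = LEXICON.get(match.group())
        if label:
            annotations.append((match.start(), match.group(), label))
    # Rule-based annotation on the original text.
    for match in AMOUNT_RULE.finditer(text):
        annotations.append((match.start(), match.group(), "AMOUNT"))
    return sorted(annotations)


def label_counts(annotations):
    """Count how often each label occurs (a simple quantitative view)."""
    return Counter(label for _, _, label in annotations)


sample = "El beneficio neto fue de 120 millones de euros, pese a la deuda."
annotations = annotate(sample)
counts = label_counts(annotations)
```

Here `annotations` holds the tagged spans in text order and `counts` gives the frequency of each label, which is the kind of figure a corpus tool would then compare across documents.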

Our proposal brings together the experience of two internationally recognized research teams, the Laboratorio de Lingüística Informática at Universidad Autónoma de Madrid (LLI-UAM) and the University Centre for Computer Corpus Research on Language (UCREL) at Lancaster University. For more than two decades, these teams have independently developed language processing tools and corpora. The main goal of this proposal is to integrate Spanish within the tools developed by UCREL, in order to use them to analyse financial texts, more specifically companies' annual reports. For this purpose, a Spanish corpus of financial texts will be collected and annotated with a new version of the Semantic Tagger of UCREL.

The topics and results of the project fall fully within the framework of Reto 7 'Digital Economy and Society', because they help to process and understand financial documents in digital format. Language technologies are included in a strategic plan of the Digital Agenda for Spain. The industrial transfer of the results could be carried out through software and services developed and offered by research institutions such as the Instituto de Ingeniería del Conocimiento, a private non-profit body dedicated to research. It is located on the UAM campus, where some of the research team members collaborate.




Papers

Moreno-Sandoval, A., Gisbert, A., Haya, P.A., Guerrero, M. and Montoro, H.: "Tone Analysis in Spanish Financial Reporting Narratives." In Proceedings of the Second Financial Narrative Processing Workshop (FNP 2019). NoDaLiDa, Turku, Finland, 30 Sept 2019, pp. 42-50.

Moreno-Sandoval, A., Gisbert, A. and Montoro, H.: "FinT-esp: a corpus of financial reports in Spanish." Presented at CILC-2019, Valencia. To be published by Editorial Comares.

Moreno-Sandoval, A.: "Possibility and necessity in financial narrative: a study of modal adverbs in Spanish." Presented at XI Congreso Internacional de Lingüística de Corpus (CILC-2019), Valencia. Published in the proceedings.

Moreno-Sandoval, A.: "Some discursive aspects of financial narrative in Spanish: modality, lexical distinctiveness and sentiment analysis." Keynote speaker at 3rd International Conference on Corpus Analysis in Academic Discourse 2019 (CAAD'19).
