Abstract of Representing multiword expressions in lexical and terminological resources: an analysis for natural language processing purposes

Carla Parra Escartín, Gyri Smørdal Losnegaard, Gunn Inger Lyse Samdal, Pedro Patiño García

  • In the context of standardisation and interoperability of Language Resources and Tools (LRT), this paper addresses the formal representation of multiword expressions (MWEs) for Natural Language Processing (NLP) purposes. By formal representation we mean the encoding of MWEs in lexical and terminological databases. The representation should render a language resource maximally reusable and ideally allow for seamless integration into any type of NLP application. In the case of MWEs, the situation is particularly complex due to their lexical properties on the one hand and their morphosyntactic variation on the other. Furthermore, their representation in multilingual resources poses even greater challenges due to extensive translational asymmetry. In this paper we discuss the challenges posed by the formal representation of MWEs. We analyse the needs of four different projects, all NLP-oriented but with slightly different approaches to the collection and representation of MWEs. Based on this analysis, we identify a minimal set of features to be accounted for in any formal representation of MWEs, as well as a set of more specific, task-dependent requirements that hinge on the intended use of the lexical resource. Finally, we assess to what extent existing standards meet these requirements.

