A Lightweight Statistical Method for Terminology Extraction

    1. [1] Universidad Nacional de Cuyo, Argentina

    2. [2] Pontifical Catholic University of Valparaiso
  • Location: Journal of Computer-Assisted Linguistic Research, e-ISSN 2530-9455, No. 7, 2023, pp. 43-59
  • Language: English
  • Abstract
    • We propose a method for automatic terminology extraction, developed as part of a larger project that aims to automate some of the tasks involved in producing terminological databases. Terminology extraction is the key to drafting the macrostructure of a terminological resource (i.e., the list of entries), which can later be enriched at the microstructural level with grammatical or semantic information. To this end, we developed a statistical method that is conceptually simple compared to modern neural network approaches. It is lightweight because it relies on term dispersion and co-occurrence statistics that can be computed on basic hardware. For the evaluation, we experimented with English and Spanish corpora of lexicography and linguistics of ca. 66 million tokens. The results improve on the baselines by almost 20%.
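
The abstract names term dispersion and co-occurrence statistics but does not give formulas. As a rough illustration only, the sketch below uses a TF-IDF-style weight as a stand-in for dispersion and counts document-level co-occurrence of word pairs. The function names `dispersion_scores` and `cooccurrence_counts`, the whitespace tokenization, and the weighting itself are all assumptions made for this example, not the authors' method.

```python
import math
from collections import Counter
from itertools import combinations


def dispersion_scores(documents):
    """Weight each word by frequency times log(N / document frequency).

    Assumption for illustration: words concentrated in few documents are
    better term candidates than words spread evenly across the corpus.
    This is a TF-IDF-style stand-in, not the formula from the paper.
    """
    n_docs = len(documents)
    term_freq = Counter()
    doc_freq = Counter()
    for doc in documents:
        tokens = doc.lower().split()  # naive whitespace tokenization
        term_freq.update(tokens)
        doc_freq.update(set(tokens))
    return {
        word: term_freq[word] * math.log(n_docs / doc_freq[word])
        for word in term_freq
    }


def cooccurrence_counts(documents):
    """Count in how many documents each unordered word pair appears together."""
    pairs = Counter()
    for doc in documents:
        vocab = sorted(set(doc.lower().split()))
        pairs.update(combinations(vocab, 2))  # pairs are alphabetically ordered
    return pairs


docs = [
    "the lemma is the headword",
    "the corpus has the lemma",
    "the cat sat",
    "the dog ran",
]
scores = dispersion_scores(docs)
pairs = cooccurrence_counts(docs)
```

In this toy corpus, "the" occurs in every document and therefore scores zero, while "lemma" is confined to two documents and scores higher. Both statistics need only a single pass over the corpus and a pair of counters, which is consistent with the abstract's claim that such statistics can be computed on basic hardware.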

