
Dialnet


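Avaluació de la qualitat terminològica en traducció automàtica basada en corpus del turc a l'anglès en l'àmbit mèdic [Evaluation of terminology quality in corpus-based machine translation from Turkish to English in the medical domain]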

  • Author: Gokhan Dogru
  • Thesis supervisors: Adrià Martín Mor (supervisor), Anna Aguilar-Amat (co-supervisor)
  • Defended: at the Universitat Autònoma de Barcelona (Spain) in 2021
  • Language: Spanish
  • Thesis examining committee: Maite Melero Nogués (chair), Olga Torres Hostench (secretary), Mehmet Sahin (member)
  • Doctoral program: Doctoral Program in Translation and Intercultural Studies at the Universidad Autónoma de Barcelona
  • Subjects:
  • Links
    • Open-access thesis at: TDX
  • Abstract
    • Although general quality aspects of machine translation (MT) such as adequacy and fluency have been studied extensively, more fine-grained aspects such as terminology translation quality have received little attention, especially in the context of translation studies. The objective of this study is to analyze the types and frequencies of terminology errors in custom statistical machine translation (SMT) and neural machine translation (NMT) engines, with the goal of understanding how MT system type, corpus type and corpus size affect terminology translation quality.

      A Turkish-English parallel corpus obtained from cardiology journal abstracts was built from scratch for training domain-specific SMT and NMT engines. This domain-specific corpus was then combined with a mixed-domain corpus, and two more engines were trained. After automatic and human evaluation of these four engines, terminology errors were annotated based on a custom terminology error typology. It was found that the types and frequencies of terminology errors are significantly different in SMT and NMT systems, and that changes in corpus size and corpus type had a more drastic impact on NMT than on SMT.

      A key contribution of the dissertation to MT research is a language-agnostic terminology error typology that can be used to evaluate the relative strengths and weaknesses of different MT systems in terms of terminology. In addition, the finding that NMT systems exhibit different types of term errors with different frequencies implies that post-editing guidelines conceived specifically for SMT systems may need to be changed to accommodate the behavior patterns of NMT.

