Using annotated discourse information of a RST Spanish-Chinese treebank for translation and language learning tasks

Shuyuan Cao

Author: Shuyuan Cao
Thesis supervisors: Iria da Cunha Fanego (supervisor), Mikel Iruskieta Quintian (co-supervisor)
Defended: Universitat Pompeu Fabra (Spain), 2018
Language: Spanish
Thesis examination committee: M. Aranzazu Diaz de Ilarraza Sanchez (chair), Mireia Vargas Urpí (secretary), Juliano Desiderato Antonio (member)
Doctoral program: Programa de Doctorado en Traducción y Ciencias del Lenguaje por la Universidad Pompeu Fabra
Subjects:
- Technological sciences
Links
- Open-access thesis at: TDX
Abstract
- As one of the essential elements of Natural Language Processing (NLP), discourse has attracted much attention in recent years. Many studies explore how discourse elements affect different NLP research areas, such as parsing, sentiment analysis, and machine translation evaluation. Moreover, as discourse analysis has developed, treebanks annotated with discourse information for different languages have contributed greatly to advancing NLP research.
  
  Spanish and Chinese are two of the most widely spoken languages in the world, and the language pair occupies an important position in NLP studies. This study therefore carries out a contrastive discourse analysis of the two languages, annotating discourse similarities and differences under the theoretical framework of Rhetorical Structure Theory (RST) by Mann and Thompson (1988).
  
  The main objective of this study is to develop, based on the annotation results, a protocol that includes recommendations for Spanish-Chinese translation. In addition, in today's globalized society, communication between Spanish and Chinese speakers is increasingly intensive. Another aim of our study is therefore to develop resources for language learning between Spanish and Chinese.
  
  To develop the protocol, we first build a Spanish-Chinese parallel corpus and annotate the discourse information of the entire corpus. We then evaluate the annotation results with a qualitative method to ensure the high quality of the annotation. Lastly, we summarize the discourse similarities and differences and use them to draw up the protocol. For language learning between the two languages, we use the manually annotated discourse markers (DMs) to develop a question-answering module.
  
  In recent years, there have been few contrastive studies of Spanish and Chinese discourse. This PhD study therefore aims to partially fill a knowledge gap in the contrastive study of Spanish and Chinese.
