
Dialnet


Challenges of combining structured and unstructured data in corpus development

    1. [1] University of Helsinki, Helsinki, Finland

    2. [2] Linnaeus University, Sweden

  • Location: Research in Corpus Linguistics (RiCL), e-ISSN 2243-4712, Vol. 9, Extra No. 1, 2021 (Issue dedicated to: "Challenges of combining structured and unstructured data in corpus development"), pp. 1-8
  • Language: English
  • Abstract
    • Recent advances in the availability of ever larger and more varied electronic datasets, both historical and modern, provide unprecedented opportunities for corpus linguistics and the digital humanities. However, combining unstructured text with images, video, and audio, as well as structured metadata, poses a variety of challenges to corpus compilers. This paper presents an overview of the topic to contextualise this special issue of Research in Corpus Linguistics. The aim of the special issue is to highlight some of the challenges faced and solutions developed in several recent and ongoing corpus projects. Rather than providing overall descriptions of corpora, each contributor discusses specific challenges they faced in the corpus development process, summarised in this paper. We hope that the special issue will benefit future corpus projects by providing solutions to common problems and by paving the way for new best practices for the compilation and development of rich-data corpora. We also hope that this collection of articles will help keep the conversation going on the theoretical and methodological challenges of corpus compilation.

