Extracció d'informació de documents semi-estructurats amb models neurals

Author: Manuel Carbonell Nuñez
Thesis supervisors: Josep Llados Canet (supervisor), Alicia Fornés Bisquerra (co-supervisor), Mauricio Villegas Santamaría (co-supervisor)
Defense: Universitat Autònoma de Barcelona (Spain), 2020
Language: Spanish
Thesis committee: Joan Andreu Sánchez Peiró (chair), Dimosthenis Karatzas (secretary), Anjan Dutta (member)
Doctoral program: Programa de Doctorado en Informática por la Universidad Autónoma de Barcelona
Subjects:
- Mathematics
  - Computer science
    - Artificial intelligence
Links
- Open-access thesis at: TDX
Abstract
- Sectors such as fintech, legaltech, and insurance process an inflow of millions of forms, invoices, ID documents, claims, and similar documents every day. Successful automation of these transactions depends on the ability to digitize the textual content correctly and to incorporate semantic understanding. This procedure, known as information extraction (IE), comprises the steps of localizing and recognizing text, identifying the named entities it contains, and optionally finding relationships among its elements. In this work we explore multi-task neural models at the image and graph level to solve all steps in a unified way. In doing so, we identify the benefits and limitations of these end-to-end approaches in comparison with sequential, separate methods.
