Contributions and applications around low resource deep learning modeling

  • Author: Ivan Valles Perez
  • Thesis supervisor: Emilio Soria Olivas
  • Defense: at the Universitat de València (Spain) in 2023
  • Language: English
  • Thesis committee: José Antonio Gámez Martín (chair), Julia Amorós López (secretary), Ignacio José Díaz Blanco (member)
  • Doctoral program: Doctoral Program in Electronic Engineering, Universitat de València (Estudi General)
  • Abstract
    • Deep learning is the state of the art for several machine learning tasks. Many of these tasks require large amounts of computational resources, which limits their adoption on embedded devices. The main goal of this dissertation is to study methods and algorithms that make it possible to tackle problems with deep learning under restricted computational resources. This work also aims to present applications of deep learning in industry.

      The first contribution is a new activation function for deep neural networks: the modulus function. The experiments show that the proposed activation function achieves superior results in computer vision tasks when compared with the alternatives found in the literature.
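
      As a rough illustration, and assuming that "modulus" here means the absolute value f(x) = |x|, the activation and its derivative can be sketched in plain Python. The ReLU baseline and all function names below are illustrative choices, not taken from the thesis.

```python
def modulus(x: float) -> float:
    """Modulus activation, assumed here to be the absolute value f(x) = |x|."""
    return abs(x)


def modulus_grad(x: float) -> float:
    """Derivative of |x|: the sign of x (0 at x = 0, by convention)."""
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0


def relu(x: float) -> float:
    """ReLU, shown only as a familiar baseline for comparison."""
    return max(0.0, x)


inputs = [-2.0, -0.5, 0.0, 0.5, 2.0]
modulus_out = [modulus(x) for x in inputs]
relu_out = [relu(x) for x in inputs]
modulus_grads = [modulus_grad(x) for x in inputs]

print(modulus_out)
print(relu_out)
print(modulus_grads)
```

      Under this assumption, negative inputs are mirrored rather than zeroed, so the gradient has magnitude 1 everywhere except at the origin, whereas ReLU passes no gradient for negative inputs.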

      The second contribution is a new strategy to combine pre-trained models using knowledge distillation. The results of this chapter show that it is possible to significantly increase the accuracy of the smallest pre-trained models, allowing high performance at a lower computational cost.
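
      The abstract does not detail the combination strategy, so the following is only a generic sketch of knowledge distillation with several teachers. It assumes the teachers' temperature-softened predictions are simply averaged and that the student is trained with a cross-entropy loss against that average. All names, logits, and the temperature value are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_soft_targets(teacher_logits, temperature):
    """Average the softened class distributions of several teachers (assumed strategy)."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n_classes = len(probs[0])
    return [sum(p[i] for p in probs) / len(probs) for i in range(n_classes)]


def distillation_loss(student_logits, soft_targets, temperature):
    """Cross-entropy between the teachers' soft targets and the softened student output."""
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(soft_targets, student_probs))


# Toy logits for one input and three classes (made-up values).
teachers = [[4.0, 1.0, 0.0], [3.0, 2.0, 0.0]]
targets = ensemble_soft_targets(teachers, temperature=2.0)

loss_agreeing = distillation_loss([3.5, 1.5, 0.0], targets, temperature=2.0)
loss_disagreeing = distillation_loss([0.0, 1.5, 3.5], targets, temperature=2.0)
print(loss_agreeing, loss_disagreeing)
```

      In this toy case, a student whose logits agree with the teachers gets a lower loss than one that ranks the classes in the opposite order, which is the signal that drives the small model toward the behavior of the larger ones.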

      The third contribution tackles the problem of sales forecasting in the field of logistics. Two end-to-end systems based on two different deep learning techniques (sequence-to-sequence models and transformers) are proposed. The results of this chapter show that it is possible to build end-to-end systems that predict the sales of multiple individual products, at multiple points of sale and at different times, with a single machine learning model. The proposed model outperforms the alternatives found in the literature.
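
      One way to picture the "single model" setup is to pool sliding windows from every product and point of sale into one training set, tagging each window with its identifiers. The sketch below covers only that data-framing step under those assumptions. It is not the thesis pipeline, and the function name, field names, and sales figures are invented for illustration.

```python
def make_windows(sales, history, horizon):
    """Build (past, future) training examples from every (product, store) series.

    `sales` maps (product_id, store_id) to a list of per-period sales.
    Each example keeps its identifiers so one model can serve all series.
    """
    examples = []
    for (product, store), series in sorted(sales.items()):
        last_start = len(series) - history - horizon
        for start in range(last_start + 1):
            split = start + history
            examples.append({
                "product": product,
                "store": store,
                "past": series[start:split],             # encoder input
                "future": series[split:split + horizon],  # decoder target
            })
    return examples


# Made-up toy data: three product/store series of five periods each.
sales = {
    ("P1", "S1"): [3, 4, 5, 6, 7],
    ("P1", "S2"): [0, 1, 0, 2, 1],
    ("P2", "S1"): [9, 8, 7, 6, 5],
}
examples = make_windows(sales, history=3, horizon=1)
print(len(examples))
print(examples[0])
```

      A sequence-to-sequence model or a transformer could then consume the "past" window together with the product and store identifiers and be trained to predict the "future" window.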

      Finally, the last two contributions belong to the field of speech technology. The first studies how to build a keyword spotting speech recognition system using an efficient version of a convolutional neural network. In this study, the proposed system outperforms all the benchmarks found in the literature when tested on the most complex subtasks.

      The second proposes a standalone state-of-the-art text-to-speech model capable of synthesizing intelligible voice in thousands of voice profiles, while generating speech with meaningful and expressive prosody variations. The proposed approach removes the dependency of previous models on an additional voice system, which makes the system more efficient at training and inference time and enables offline and on-device operation.

