Dialnet


Summary of Analysis and design of scalable pre-processing techniques of instances for imbalanced Big Data problems. Applications in humanitarian emergencies situations

María José Basgall

  • This thesis addresses the distributed, scalable pre-processing of Big Data sets in order to obtain good-quality data, known as Smart Data. In particular, it focuses on classification problems and on four characteristics of their data: (a) class imbalance; (b) redundancy; (c) high dimensionality; and (d) overlapping.

    To that end, the thesis sets the following specific objectives:

    -To enable a state-of-the-art algorithm, widely used to treat class imbalance in traditional data scenarios (Small Data), to obtain adequate results from large datasets in a distributed manner and in reasonable execution times.

    -To design and implement a fast, scalable methodology that reduces both instances and attributes in Big Data sets with high redundancy and dimensionality, while maintaining the predictive capacity of the original dataset.

    -To design and implement a strategy for scalable data characterisation in the context of Big Data classification, focusing on the ambiguous areas of the problem.

    -To apply the knowledge acquired during the development phase to problems of interest related to humanitarian emergencies.

