Prediction based on averages over automatically induced learners: ensemble methods and Bayesian techniques

  • Author: Daniel Hernández Lobato
  • Thesis supervisor: Alberto Suárez González
  • Defense: Universidad Autónoma de Madrid (Spain), 2009
  • Language: English
  • Examination committee: Juan Ramón Vidal Romaní (chair), José Ramón Dorronsoro Ibero (secretary), Jesús Cid Sueiro (member), Vicente López Martínez (member), Hilbert Kappen (member), Grigorios Tsoumakas (member)
  • Abstract:
    • Ensemble methods and Bayesian techniques are two learning paradigms that can be useful to alleviate the difficulties associated with automatic induction from a limited amount of data in the presence of noise. Instead of considering a single hypothesis for prediction, these methods take into account the outputs of a collection of hypotheses compatible with the observed data.

      Averaging the predictions of different learners provides a mechanism to produce more accurate and robust decisions. However, the practical use of ensembles and Bayesian techniques in machine learning presents some complications. Specifically, ensemble methods have large storage requirements: the predictors of the ensemble need to be kept in memory so that they can be readily accessed. Furthermore, computing the final ensemble decision requires querying every predictor in the ensemble, so the prediction cost increases linearly with the ensemble size.
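      As a concrete illustration of these costs, consider the minimal Python sketch below (not code from the thesis; the scikit-learn trees and the toy data are placeholder choices): a bagging-style regression ensemble must keep every fitted predictor in memory, and each prediction queries all of them.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Minimal bagging-style regression ensemble (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X[:, 0] + 0.1 * rng.normal(size=200)   # toy regression data

T = 100                                     # ensemble size
ensemble = []
for _ in range(T):
    idx = rng.integers(0, len(X), size=len(X))          # bootstrap resample
    ensemble.append(DecisionTreeRegressor().fit(X[idx], y[idx]))

def predict(X_new):
    # The ensemble decision is the average of all individual
    # predictions: every one of the T predictors is queried,
    # so the cost per prediction grows linearly with T.
    return np.mean([tree.predict(X_new) for tree in ensemble], axis=0)
```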

      In general, it is also difficult to estimate an appropriate value for the size of the ensemble. On the other hand, Bayesian approaches require the evaluation of multi-dimensional integrals or summations with an exponentially large number of terms that are often intractable. In practice, these calculations are made using approximate algorithms that can be computationally expensive. This thesis addresses some of these shortcomings and proposes novel applications of ensemble methods and Bayesian techniques in supervised learning tasks of practical interest.
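      The difficulty is easy to see in miniature. For a one-parameter model the Bayesian average over hypotheses can be approximated on a grid, as in the sketch below (a toy Bernoulli model, not an example from the thesis); with d parameters the same grid would have K^d points, which is why realistic models need approximate inference algorithms.

```python
import numpy as np

# Grid approximation of Bayesian averaging for a 1-D Bernoulli model.
K = 1000
theta = np.linspace(0.0, 1.0, K)        # grid of hypotheses
prior = np.full(K, 1.0 / K)             # uniform prior
heads, tails = 7, 3                     # observed data
likelihood = theta**heads * (1.0 - theta)**tails
posterior = prior * likelihood
posterior /= posterior.sum()

# Predictive probability of the next outcome: the predictions of
# all hypotheses, averaged under their posterior weights.
p_next = np.sum(posterior * theta)
print(p_next)   # close to the exact value (heads + 1) / (heads + tails + 2)
```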

      In the first part of this thesis we analyze different pruning methods that reduce the memory requirements and prediction times of ensembles. These methods replace the original ensemble by a subensemble with good generalization properties. We show that identifying the subensemble that is optimal in terms of the training error is possible only in regression ensembles of intermediate size. For larger ensembles two approximate methods are analyzed: ordered aggregation and SDP-pruning. Both methods select subensembles that outperform the original ensemble. In classification ensembles it is possible to make inference about the final ensemble prediction by querying only a fraction of the classifiers in the ensemble. This is the basis of a novel ensemble pruning method: instance-based (IB) pruning. IB-pruning produces a large speed-up of the classification process without significantly deteriorating the generalization performance of the ensemble. This part of the thesis also describes a statistical procedure for determining an adequate size for the ensemble: the probabilistic framework introduced in IB-pruning can be used to infer the size of a classification ensemble so that the resulting ensemble predicts the same class label as an ensemble of infinite size with a specified confidence level.
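      The following sketch (hypothetical code, not taken from the thesis) illustrates both ideas in simplified form: a greedy ordered-aggregation pass that reorders regressors by their contribution to the training error of the growing subensemble, and a deterministic stopping rule that halts the voting as soon as the remaining classifiers can no longer change the majority outcome. IB-pruning replaces this deterministic test with a statistical one, stopping once the full ensemble's decision is already determined with a specified confidence level.

```python
import numpy as np

def ordered_aggregation(train_preds, y):
    """Greedily order T regressors so that each prefix (subensemble)
    has low training error; return the best prefix as index list."""
    mse = lambda p: np.mean((p - y) ** 2)
    T, N = train_preds.shape              # T predictors x N training cases
    selected, remaining = [], list(range(T))
    running_sum, errors = np.zeros(N), []
    for size in range(1, T + 1):
        # Pick the predictor whose inclusion minimizes the training
        # error of the averaged subensemble.
        best = min(remaining,
                   key=lambda t: mse((running_sum + train_preds[t]) / size))
        remaining.remove(best)
        selected.append(best)
        running_sum += train_preds[best]
        errors.append(mse(running_sum / size))
    best_size = int(np.argmin(errors)) + 1
    return selected[:best_size]

def majority_vote_early_stop(classifiers, x):
    """Deterministic simplification of instance-based pruning for a
    binary (+1/-1) ensemble: stop querying once the outcome is fixed."""
    margin, T = 0, len(classifiers)
    for i, clf in enumerate(classifiers, start=1):
        margin += clf(x)                  # each classifier votes +1 or -1
        if abs(margin) > T - i:           # remaining votes cannot flip it
            break
    return 1 if margin > 0 else -1
```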

      The second part of this thesis proposes novel applications of Bayesian techniques with a focus on computational efficiency. Specifically, the expectation propagation (EP) algorithm is used as an alternative to more computationally expensive methods such as Markov chain Monte Carlo or type-II maximum likelihood estimation. In this part of the thesis we introduce the Bayes machine for binary classification. In this Bayesian classifier the posterior distribution of a parameter that quantifies the level of noise in the class labels is inferred from the data. This posterior distribution can be efficiently approximated using the EP algorithm, and when EP is used to compute the approximation, the Bayes machine does not require any re-training to estimate this parameter. The cost of training the Bayes machine can be further reduced using a sparse representation, which is found by a greedy algorithm whose performance is improved by considering additional refining iterations.

      Finally, we show that EP can be used to approximate the posterior distribution of a Bayesian model for the classification of microarray data. The EP algorithm significantly reduces the training cost of this model and is useful for identifying relevant genes for subsequent analysis.
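      As a heavily simplified illustration of the role of the noise parameter (this is not the Bayes machine itself, and no EP is involved): if the decision boundary were held fixed, label flips would behave like Bernoulli events with rate eps, and a Beta prior on eps would give a closed-form posterior. The Bayes machine infers eps jointly with the classifier, which is where the EP approximation becomes necessary.

```python
# Toy posterior over a label-noise level eps, assuming a *fixed*
# classifier (a drastic simplification of the Bayes machine).
a0, b0 = 1.0, 9.0        # Beta prior: noise expected to be small
flips = 12               # training labels that contradict the classifier
agreements = 188         # training labels that agree with it
a, b = a0 + flips, b0 + agreements
eps_posterior_mean = a / (a + b)
print(eps_posterior_mean)   # about 0.062: the inferred noise level
```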

