
Dialnet


Summary of Peak annotation and data analysis software tools for mass spectrometry imaging

Lluc Semente Fernandez

  • español

    La metabolómica espacial es la disciplina que estudia las imágenes de las distribuciones de compuestos químicos de bajo peso (metabolitos) en la superficie de los tejidos biológicos para revelar interacciones entre moléculas. Las imágenes de espectrometría de masas (MSI) es actualmente la principal técnica para obtener información de imágenes moleculares para la metabolómica espacial. MSI es una tecnología de imágenes moleculares sin marcador que produce espectros de masas que conservan las estructuras espaciales de las muestras de tejido. Esto se logra ionizando pequeñas porciones de una muestra (un píxel) en un ráster definido a través de toda su superficie, lo que da como resultado una colección de imágenes de distribución de iones (registradas como relaciones masa-carga (m/z)) sobre la muestra. Esta tesis tiene como objetivo desarrollar herramientas computacionales para la anotación de picos en MSI y en el diseño de flujos de trabajo para el análisis estadístico y multivariado de datos MSI, incluida la segmentación espacial. El trabajo realizado en esta tesis se puede separar claramente en dos partes. En primer lugar, el desarrollo de una herramienta de anotación de picos de isótopos y aductos adecuada para facilitar la identificación de compuestos de bajo rango de masa. Ahora podemos encontrar fácilmente iones monoisotópicos en nuestros conjuntos de datos MSI gracias al paquete de software rMSIannotation.

  • català

    La metabolòmica espacial és la disciplina que estudia les imatges de les distribucions de compostos químics de baix pes (metabòlits) a la superfície dels teixits biològics per revelar interaccions entre molècules. La imatge d'espectrometria de masses (MSI) és actualment la tècnica principal per obtenir informació d'imatges moleculars per a la metabolòmica espacial. MSI és una tecnologia d'imatges moleculars sense marcador que produeix espectres de masses que conserven les estructures espacials de les mostres de teixit. Això s'aconsegueix ionitzant petites porcions d'una mostra (un píxel) en un ràster definit a través de tota la seva superfície, cosa que dona com a resultat una col·lecció d'imatges de distribució de ions (registrades com a relacions massa-càrrega (m/z)) sobre la mostra. Aquesta tesi té com a objectius desenvolupar eines computacionals per a l'anotació de pics de MSI i el disseny de fluxos de treball per a l'anàlisi estadística i multivariant de dades MSI, inclosa la segmentació espacial. El treball realitzat en aquesta tesi es pot separar clarament en dues parts. En primer lloc, el desenvolupament d'una eina d'anotació de pics d'isòtops i adductes adequada per facilitar la identificació de compostos de rang de massa baix. Ara podem trobar fàcilment ions monoisotòpics als nostres conjunts de dades MSI gràcies al paquet de programari rMSIannotation. En segon lloc, el desenvolupament de eines de programari per a l’anàlisi de dades i la segmentació espacial basades en soft clustering per a dades MSI.

  • English

    Spatial metabolomics is the discipline that studies images of the distribution of low-molecular-weight chemical compounds (metabolites) on the surface of biological tissues to unveil interactions between molecules. Mass spectrometry imaging (MSI) is currently the principal technique for obtaining molecular imaging information for spatial metabolomics. MSI is a label-free molecular imaging technology that produces mass spectra while preserving the spatial structure of tissue samples. This is achieved by ionizing small portions of a sample (pixels) in a defined raster across its whole surface, which results in a collection of ion distribution images (registered as mass-to-charge ratios (m/z)) over the sample. This thesis aims to develop computational tools for peak annotation in MSI and to design workflows for the statistical and multivariate analysis of MSI data, including spatial segmentation. The work carried out in this thesis can be clearly separated into two parts: firstly, the development of an isotope and adduct peak annotation tool suited to facilitating the identification of compounds in the low mass range; secondly, the development of software tools for data analysis and spatial segmentation based on soft clustering for MSI data. All the developed algorithms have been implemented in software tools on the R platform, as continuations of rMSI and rMSIproc, since R is open and widely used among biological data analysts. In addition, we complement the R code with C++ to enable efficient memory control and faster execution of iterative algorithms. All the tools developed for this thesis are released under the General Public License (GPL) to facilitate the exchange of ideas and collaboration within the MSI community.

    Identifying the molecular formula and/or the chemical structure of the compounds in an MSI dataset is very challenging because usually only the exact mass (m/z) is available. Therefore, new methods for identifying and reporting molecular annotations in the low mass range for spatial metabolomics studies are required. To this end, we first aimed to determine which peaks were monoisotopic ions, as searches in compound libraries are based on the monoisotopic ion of the molecule.

    Additionally, we aimed to find groups of monoisotopic ions differing only in the adduct ion, in order to determine neutral masses for the compounds and thereby reduce the number of hits in compound libraries. However, we noticed that many compounds were hard to find in libraries like the Human Metabolome Database, as these are based on LC-MS data, and that, depending on the study, some compounds (such as specific algal metabolites) might not be included in some libraries. Therefore, developing a tool to match libraries against the data was not an option; moreover, the METASPACE tool, which successfully uses this strategy, already existed by that time, and we did not want to reinvent the wheel. Instead, we opted for a completely new strategy that does not require searching libraries at runtime. We developed a general rule for the carbon-based isotopic patterns of the family of compounds of interest (metabolites and lipids) and compared it with the spectral data. In contrast to LC-MS annotation methods, the higher number of observations (i.e., pixels) usually found in MSI gives statistical power to the results and allows the ratios between isotopic peaks to be used as a key variable for monoisotopic peak annotation. The result of this research was the peak annotation tool rMSIannotation. rMSIannotation is useful for compound annotation and variable reduction strategies, and it can be integrated into any MSI data analysis workflow for low-weight compounds. The results show that rMSIannotation automatically extracts valuable information from data acquired on both high-resolution (TOF) and ultra-high-resolution (FT-ICR) spectrometers. The presented algorithm demonstrated high performance and annotation confidence when compared with the established metabolomics MSI annotation platform METASPACE and with manual annotation approaches.
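
    The idea of using isotopic peak ratios across pixels, and of grouping adducts by neutral mass, can be illustrated with a minimal Python sketch. It is not the rMSIannotation implementation (which is written in R and C++) and does not reproduce the thesis's isotopic-pattern rule. The function names, tolerances, and the simplification that the M+1/M ratio depends only on the number of carbon atoms are assumptions made for illustration.

```python
import math

C13_DELTA = 1.0033548          # mass difference between 13C and 12C (Da)
C13_RATIO = 0.0107 / 0.9893    # natural 13C/12C abundance ratio
# Mass shifts of common positive-mode adducts (Da); an illustrative subset.
ADDUCTS = {"[M+H]+": 1.007276, "[M+Na]+": 22.989218, "[M+K]+": 38.963158}


def pearson(x, y):
    """Pearson correlation of two equally long intensity lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def isotope_candidate(mz_m, mz_m1, img_m, img_m1, ppm_tol=10.0, min_corr=0.8):
    """Score a (M, M+1) pair of singly charged ions from per-pixel intensities.

    Returns None when the m/z gap does not match one 13C substitution.
    Assumption: contributions of non-carbon isotopes to M+1 are ignored.
    """
    expected = mz_m + C13_DELTA
    if abs(mz_m1 - expected) / expected * 1e6 > ppm_tol:
        return None
    denom = sum(a * a for a in img_m)
    if denom == 0:
        return None
    # Least-squares slope through the origin: M+1/M ratio over all pixels.
    ratio = sum(a * b for a, b in zip(img_m, img_m1)) / denom
    n_carbons = ratio / C13_RATIO
    corr = pearson(img_m, img_m1)
    # A molecule cannot hold more carbons than its mass divided by 12.
    plausible = 1 <= round(n_carbons) <= mz_m / 12.0 and corr >= min_corr
    return {"ratio": ratio, "n_carbons": n_carbons, "corr": corr,
            "plausible": plausible}


def adduct_pairs(mono_mzs, ppm_tol=10.0):
    """Find pairs of monoisotopic m/z values implying the same neutral mass."""
    hits = []
    for i, mz_a in enumerate(mono_mzs):
        for mz_b in mono_mzs[i + 1:]:
            for name_a, shift_a in ADDUCTS.items():
                for name_b, shift_b in ADDUCTS.items():
                    if name_a == name_b:
                        continue
                    neutral_a, neutral_b = mz_a - shift_a, mz_b - shift_b
                    if abs(neutral_a - neutral_b) / neutral_a * 1e6 <= ppm_tol:
                        hits.append((mz_a, name_a, mz_b, name_b,
                                     (neutral_a + neutral_b) / 2))
    return hits
```

    In this sketch, the many pixels of an ion image play the role of repeated observations: the ratio is fitted over all of them rather than read from a single spectrum.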

    The huge number of ions in an MSI dataset requires automating the reporting of ions of interest between different regions in spatial metabolomics studies. This is challenging because each experiment yielded a large number of ions and several spatial segmentation solutions, which resulted in a large number of possible combinations of results. Additionally, as some ions have very low intensity values in some pixels of the clusters, classical parametric statistical tests failed. To overcome this, we developed a workflow using nonparametric tests and the percentage of pixels in which a particular ion is not detected. This work resulted in the publication of the R package rMSIKeyIon. The tool is very effective at discovering up- or down-regulated ions between clusters using an unsupervised k-means procedure. The selected ions are the candidates that subsequently have to be identified. This package is a valuable tool for the untargeted analysis of MALDI images and is an important advance in this area because, at present, no such tools are available.
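
    As a rough illustration of combining a nonparametric test with the share of pixels where an ion is not detected, the following Python sketch compares one ion between two clusters. It is not the rMSIKeyIon code: the choice of the Mann-Whitney U test (normal approximation with tie correction), the function names, and the returned fields are all assumptions.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test; returns (U of x, approximate p-value)."""
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var <= 0:
        return u1, 1.0                 # all values identical: no evidence
    z = (u1 - n1 * n2 / 2) / math.sqrt(var)
    return u1, math.erfc(abs(z) / math.sqrt(2))


def zero_fraction(values):
    """Fraction of pixels in which the ion was not detected (intensity 0)."""
    return sum(1 for v in values if v == 0) / len(values)


def compare_ion(cluster_a, cluster_b):
    """Summarise one ion's intensities in two clusters of pixels."""
    u, p = mann_whitney_u(cluster_a, cluster_b)
    half = len(cluster_a) * len(cluster_b) / 2
    direction = "up_in_a" if u > half else "down_in_a" if u < half else "none"
    return {"p_value": p, "direction": direction,
            "zero_frac_a": zero_fraction(cluster_a),
            "zero_frac_b": zero_fraction(cluster_b)}
```

    Reporting the zero fractions next to the p-value keeps ions that are simply absent from one cluster visible, which a rank test alone may describe poorly when most values are tied at zero.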

    State-of-the-art segmentation techniques used in MSI assume that every pixel belongs to exactly one cluster (hard clustering). This is clearly a limitation, because in histology there are many transition regions between histological areas that are not captured by such clustering algorithms. Besides, it is known to be very difficult to assess the performance of hard clustering algorithms. These facts suggested ranking the pixels in a cluster by similarity to the cluster prototype, with the objective of differentiating between pixels located in homogeneous regions and those in transition regions from the point of view of histological areas. In this regard, we propose a soft (fuzzy) clustering approach, a particular subset of clustering algorithms that can associate a pixel with all clusters to different degrees. We reviewed the use of soft clustering in MSI and found that the fuzzy c-means algorithm had not been used in this context. Therefore, we investigated this soft clustering method as a possible way of ranking pixels in MSI data. From the study we conclude that fuzzy c-means brings additional information to MSI data analysis through the dimension of membership, which allows new ways of interpreting the results compared with hard clustering. In our case, the study of membership through the newly developed PFS allowed easy selection of the pixels most related to a cluster, unveiled morphological regions that are more challenging to detect, and enhanced a tissue type classification workflow on multiple samples of a human head and neck cancer dataset.
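
    To make the notion of membership concrete, here is a minimal Python sketch of the standard fuzzy c-means updates on toy pixel vectors. It is not the thesis software and does not implement the PFS. The function names, the fixed initial centers, and the fuzzifier value m = 2 are assumptions chosen for illustration.

```python
import math


def memberships(point, centers, m=2.0):
    """Degrees of membership of one pixel in every cluster (they sum to 1)."""
    d = [math.dist(point, c) for c in centers]
    if any(v == 0.0 for v in d):
        # The pixel coincides with one or more centers: share membership there.
        on_center = [v == 0.0 for v in d]
        return [1.0 / sum(on_center) if z else 0.0 for z in on_center]
    inv = [v ** (-2.0 / (m - 1.0)) for v in d]
    total = sum(inv)
    return [v / total for v in inv]


def fuzzy_c_means(pixels, init_centers, m=2.0, n_iter=100, tol=1e-6):
    """Alternate membership and center updates until the centers stop moving."""
    centers = [list(c) for c in init_centers]
    n_dims = len(pixels[0])
    for _ in range(n_iter):
        u = [memberships(p, centers, m) for p in pixels]
        new_centers = []
        for i, old in enumerate(centers):
            w = [row[i] ** m for row in u]
            total = sum(w)
            if total == 0.0:
                new_centers.append(old)    # no pixel supports this cluster
                continue
            new_centers.append([
                sum(wk * p[k] for wk, p in zip(w, pixels)) / total
                for k in range(n_dims)
            ])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, [memberships(p, centers, m) for p in pixels]


def rank_pixels(u, cluster):
    """Pixel indices sorted from most to least related to one cluster."""
    return sorted(range(len(u)), key=lambda k: u[k][cluster], reverse=True)
```

    With such memberships, a pixel lying between two tissue regions receives intermediate values for both clusters, whereas hard clustering would force it into one of them; sorting by membership gives one simple way to rank pixels by similarity to a cluster prototype.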

