Processing Metabolomics and Proteomics Data with Open Software: A Practical Guide
CHAPTER 10: Modular metaX Pipeline for Processing Untargeted Metabolomics Data
Published: 16 Mar 2020
Special Collection: 2020 ebook collection
Bo Wen, 2020. "Modular metaX Pipeline for Processing Untargeted Metabolomics Data", Processing Metabolomics and Proteomics Data with Open Software: A Practical Guide, Robert Winkler
Mass spectrometry coupled with either liquid chromatography (LC-MS) or gas chromatography (GC-MS) has become a popular and powerful technique in metabolomics, which aims at comprehensive profiling of all small-molecule metabolites (<1500 Da) in biological systems. Global metabolomics, also known as untargeted metabolomics, typically generates large datasets with thousands of signals, including true biological signals from metabolites as well as noise from contaminants and artifacts. Moreover, signal drift and batch effects are frequently encountered in large-scale untargeted metabolomics studies. Computationally intensive processing and analysis are required to handle these issues and generate biologically meaningful results.
The analysis of MS-based metabolomics data usually involves peak picking, quality control, data cleaning, pre-processing, univariate and multivariate statistical analysis, and data visualization. Many open-source tools and algorithms have been developed for various aspects of the metabolomics data analysis pipeline.1–4 Some cover only a few processing steps, while others offer comprehensive pipelines for metabolomics data analysis. An overview of software and methods for metabolomics data analysis can be found in recent reviews.1–3 However, because metabolomics studies cover a wide range of purposes and experimental methods, the data analysis steps used for one study may differ from those used for another.5
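To make a few of the steps listed above concrete, the following is a minimal sketch in plain Python. It is not the metaX API and not the chapter's method. It covers only data cleaning (dropping features with too many missing values), one simple pre-processing choice (half-minimum imputation), and one univariate statistic (log2 fold change). The peak table, the feature names, the group labels, the 50% missing-value cutoff, and all function names are hypothetical and chosen purely for illustration. Peak picking, quality control, drift and batch correction, multivariate analysis, and visualization are omitted.

```python
from math import log2
from statistics import mean

# Hypothetical toy peak table: feature name -> intensity per sample.
# None marks a missing value (no peak detected in that sample).
peak_table = {
    "M101T50": [1000.0, 1100.0, 900.0, 2100.0, 1900.0, 2000.0],
    "M202T75": [500.0, None, None, None, 480.0, None],
    "M303T120": [300.0, 320.0, None, 290.0, 305.0, 315.0],
}
# Hypothetical group label for each sample column, in the same order.
groups = ["control", "control", "control", "case", "case", "case"]


def drop_sparse_features(table, max_missing=0.5):
    """Data cleaning: keep features missing in at most max_missing of samples."""
    return {
        name: values
        for name, values in table.items()
        if sum(v is None for v in values) / len(values) <= max_missing
    }


def impute_half_min(table):
    """Pre-processing: replace missing values with half the feature minimum.

    Assumes every feature has at least one observed value, which holds
    after drop_sparse_features with max_missing < 1.
    """
    imputed = {}
    for name, values in table.items():
        fill = min(v for v in values if v is not None) / 2
        imputed[name] = [fill if v is None else v for v in values]
    return imputed


def log2_fold_change(table, groups, case, control):
    """Univariate analysis: log2 ratio of the case mean to the control mean."""
    result = {}
    for name, values in table.items():
        case_mean = mean([v for v, g in zip(values, groups) if g == case])
        control_mean = mean([v for v, g in zip(values, groups) if g == control])
        result[name] = log2(case_mean / control_mean)
    return result


cleaned = drop_sparse_features(peak_table)
imputed = impute_half_min(cleaned)
fold_changes = log2_fold_change(imputed, groups, case="case", control="control")

for feature, value in fold_changes.items():
    print(f"{feature}: log2 fold change = {value:.3f}")
```

Each step is a separate function that takes a table and returns a new one, so a step can be swapped or skipped without touching the others. This reflects the point made above that the steps used for one study may differ from those used for another.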