CHAPTER 16: Python in Proteomics
Published: 16 Mar 2020
Python is a versatile scripting language that is widely used in bioinformatics and has generated increasing interest in the proteomics and mass spectrometry community. Computing and data analysis in mass spectrometry are very diverse and in many cases must be tailored to a specific experiment. Python suits this task well because of its flexibility, its visualization capabilities and its large number of powerful libraries. It can be used to prototype software quickly and to combine existing libraries into powerful analysis workflows, avoiding the trap of re-inventing the wheel for each new project. Here, we describe data analysis and software prototyping for mass spectrometric data with Python using the pyOpenMS package. pyOpenMS is an open-source Python library for mass spectrometry, built specifically for the analysis of proteomics and metabolomics data. It facilitates common tasks in proteomics (and other mass spectrometric fields): file handling; chemistry (mass calculation, peptide fragmentation, isotopic abundances); signal processing (smoothing, filtering, de-isotoping, retention time correction and peak-picking); identification analysis (including peptide search, PTM analysis, cross-linked analytes, FDR control, RNA oligonucleotide search and small molecule search tools); quantitative analysis (including label-free, metabolomics, SILAC, iTRAQ and SWATH/DIA analysis tools); and chromatogram analysis (chromatographic peak picking, smoothing, elution profiles and peak scoring for SRM/MRM/PRM/SWATH/DIA data). It also provides an interface for interacting with common tools in proteomics and metabolomics.
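To make the "mass calculation" task mentioned above concrete, the following is a minimal sketch in plain Python of how a peptide's monoisotopic mass and the m/z of its protonated ion can be computed. It is an illustration only and does not use or reproduce the pyOpenMS API: the names `RESIDUE_MASS`, `monoisotopic_mass` and `mz` are hypothetical helpers introduced here, the sketch assumes an unmodified peptide written in one-letter amino acid codes, and the mass constants are standard monoisotopic values rounded to six decimal places.

```python
# Hand-rolled illustration (NOT the pyOpenMS API): monoisotopic peptide mass.
# Assumption: unmodified peptide, one-letter codes for the 20 standard residues.

# Monoisotopic masses of amino acid residues (amino acid minus one water), in Da.
RESIDUE_MASS = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER_MASS = 18.010565   # H2O, added once for the free N- and C-termini
PROTON_MASS = 1.007276   # mass of a proton, used for positive-mode m/z


def monoisotopic_mass(sequence: str) -> float:
    """Return the neutral monoisotopic mass of an unmodified peptide."""
    total = WATER_MASS
    for residue in sequence.upper():
        if residue not in RESIDUE_MASS:
            raise ValueError(f"Unknown residue: {residue!r}")
        total += RESIDUE_MASS[residue]
    return total


def mz(sequence: str, charge: int) -> float:
    """Return the m/z of the peptide carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (monoisotopic_mass(sequence) + charge * PROTON_MASS) / charge


if __name__ == "__main__":
    peptide = "PEPTIDE"
    print(f"{peptide}: M = {monoisotopic_mass(peptide):.4f} Da")
    for z in (1, 2, 3):
        print(f"  [M+{z}H]{z}+ m/z = {mz(peptide, z):.4f}")
```

A dedicated library is preferable to a lookup table like this one as soon as modifications, isotopic distributions or fragment ions are needed, which is the kind of functionality the chapter attributes to pyOpenMS.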