PyMS: a Python toolkit for processing of gas chromatography-mass spectrometry (GC-MS) data. Application and comparative study of selected tools

Bibliographic Details
Main Authors: O'Callaghan Sean, De Souza David P, Isaac Andrew, Wang Qiao, Hodkinson Luke, Olshansky Moshe, Erwin Tim, Appelbe Bill, Tull Dedreia L, Roessner Ute, Bacic Antony, McConville Malcolm J, Likić Vladimir A
Format: Article
Language: English
Published: BMC 2012-05-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/13/115
id doaj-cc189b7218524b72926ffbbe8d8a668b
record_format Article
spelling doaj-cc189b7218524b72926ffbbe8d8a668b; 2020-11-25T00:19:21Z; eng; BMC; BMC Bioinformatics; ISSN 1471-2105; 2012-05-01; 13(1):115; doi 10.1186/1471-2105-13-115; http://www.biomedcentral.com/1471-2105/13/115
collection DOAJ
language English
format Article
sources DOAJ
author O'Callaghan Sean
De Souza David P
Isaac Andrew
Wang Qiao
Hodkinson Luke
Olshansky Moshe
Erwin Tim
Appelbe Bill
Tull Dedreia L
Roessner Ute
Bacic Antony
McConville Malcolm J
Likić Vladimir A
author_sort O'Callaghan Sean
title PyMS: a Python toolkit for processing of gas chromatography-mass spectrometry (GC-MS) data. Application and comparative study of selected tools
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2012-05-01
description Abstract
Background: Gas chromatography–mass spectrometry (GC-MS) is a technique frequently used in targeted and non-targeted measurements of metabolites. Most existing software tools for processing raw instrument GC-MS data tightly integrate data processing methods with a graphical user interface that facilitates interactive data processing. While interactive processing remains critically important in GC-MS applications, high-throughput studies increasingly dictate the need for command-line tools suitable for scripting high-throughput, customized processing pipelines.
Results: PyMS comprises a library of functions, developed in Python, for processing instrument GC-MS data. PyMS currently provides a complete set of GC-MS processing functions, including reading of standard data formats (ANDI-MS/NetCDF and JCAMP-DX), noise smoothing, baseline correction, peak detection, peak deconvolution, peak integration, and peak alignment by dynamic programming. A novel common ion single quantitation algorithm allows automated, accurate quantitation of GC-MS electron impact (EI) fragmentation spectra when a large number of experiments are being analyzed. PyMS implements parallel processing for by-row and by-column data processing tasks based on the Message Passing Interface (MPI), allowing processing to scale across multiple CPUs in distributed computing environments. A set of specifically designed experiments was performed in-house and used to compare the performance of PyMS with that of three widely used software packages for GC-MS data processing (AMDIS, AnalyzerPro, and XCMS).
Conclusions: PyMS is a novel software package for processing raw GC-MS data, particularly suitable for scripting customized processing pipelines and for data processing in batch mode. PyMS provides limited graphical capabilities and can be used both for routine data processing and for interactive/exploratory data analysis. In real-life GC-MS data processing scenarios PyMS performs as well as or better than leading software packages. We demonstrate data processing scenarios that are simple to implement in PyMS, yet difficult to achieve with many conventional GC-MS data processing packages. Automated sample processing and quantitation with PyMS can provide substantial time savings compared to more traditional interactive software systems that tightly integrate data processing with the graphical user interface.
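Three of the processing steps named in the abstract (noise smoothing, baseline correction, and peak detection) can be illustrated with a minimal, self-contained sketch. This is not the PyMS API: the function names are hypothetical, and the moving-average filter, global-minimum baseline, and local-maximum peak picker are simple stand-ins chosen for illustration, applied here to a synthetic intensity trace rather than to real instrument data.

```python
import math


def moving_average(signal, window=5):
    """Smooth a 1-D intensity trace with a centred moving-average window."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def subtract_baseline(signal):
    """Crude baseline correction: subtract the global minimum (assumes a flat baseline)."""
    base = min(signal)
    return [value - base for value in signal]


def detect_peaks(signal, threshold):
    """Return indices of local maxima whose intensity exceeds the threshold."""
    return [
        i
        for i in range(1, len(signal) - 1)
        if signal[i] > threshold and signal[i - 1] < signal[i] >= signal[i + 1]
    ]


def gaussian(x, centre, height, width):
    """Gaussian peak shape used only to build the synthetic trace."""
    return height * math.exp(-((x - centre) ** 2) / (2 * width ** 2))


# Synthetic trace (illustrative only): two Gaussian peaks at scans 30 and 70
# on a constant baseline of 50 counts.
trace = [50 + gaussian(x, 30, 1000, 3) + gaussian(x, 70, 600, 4) for x in range(100)]

processed = subtract_baseline(moving_average(trace, window=5))
peaks = detect_peaks(processed, threshold=100)
```

Because each step is a plain function over a list, the steps compose into a scriptable batch pipeline of the kind the abstract describes; a real workflow would replace the synthetic trace with data read from an ANDI-MS/NetCDF or JCAMP-DX file.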
url http://www.biomedcentral.com/1471-2105/13/115