Automated supervised learning pipeline for non-targeted GC-MS data analysis
Non-targeted analysis is nowadays applied in many different domains of analytical chemistry such as metabolomics, environmental and food analysis. Conventional processing strategies for GC-MS data include baseline correction, feature detection, and retention time alignment before multivariate modeli...
Main Authors: | Kimmo Sirén, Ulrich Fischer, Jochen Vestner |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2019-03-01 |
Series: | Analytica Chimica Acta: X |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2590134619300015 |
id |
doaj-298a96c2f93a441a89f70a55b6e38ca1 |
---|---|
record_format |
Article |
spelling |
doaj-298a96c2f93a441a89f70a55b6e38ca1 2020-11-24T21:32:20Z eng Elsevier Analytica Chimica Acta: X 2590-1346 2019-03-01 1 Automated supervised learning pipeline for non-targeted GC-MS data analysis. Kimmo Sirén [0], Ulrich Fischer [1], Jochen Vestner [2]. [0] Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435, Neustadt, Germany; Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663, Kaiserslautern, Germany. [1] Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435, Neustadt, Germany. [2] Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435, Neustadt, Germany; Corresponding author. Non-targeted analysis is nowadays applied in many different domains of analytical chemistry such as metabolomics, environmental and food analysis. Conventional processing strategies for GC-MS data include baseline correction, feature detection, and retention time alignment before multivariate modeling. These techniques can be prone to errors and therefore time-consuming manual corrections are generally necessary. We introduce here a novel fully automated approach to non-targeted GC-MS data processing. This new approach avoids feature extraction and retention time alignment. Supervised machine learning on decomposed tensors of segmented chromatographic raw data signal is used to rank regions in the chromatograms contributing to differentiation between sample classes. The performance of this novel data analysis approach is demonstrated on three published datasets. Keywords: Metabolomics, Chemometrics, Tensor decomposition, Machine learning, Classification, Exploratory data analysis. http://www.sciencedirect.com/science/article/pii/S2590134619300015 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Kimmo Sirén Ulrich Fischer Jochen Vestner |
title |
Automated supervised learning pipeline for non-targeted GC-MS data analysis |
publisher |
Elsevier |
series |
Analytica Chimica Acta: X |
issn |
2590-1346 |
publishDate |
2019-03-01 |
description |
Non-targeted analysis is nowadays applied in many different domains of analytical chemistry such as metabolomics, environmental and food analysis. Conventional processing strategies for GC-MS data include baseline correction, feature detection, and retention time alignment before multivariate modeling. These techniques can be prone to errors and therefore time-consuming manual corrections are generally necessary. We introduce here a novel fully automated approach to non-targeted GC-MS data processing. This new approach avoids feature extraction and retention time alignment. Supervised machine learning on decomposed tensors of segmented chromatographic raw data signal is used to rank regions in the chromatograms contributing to differentiation between sample classes. The performance of this novel data analysis approach is demonstrated on three published datasets. Keywords: Metabolomics, Chemometrics, Tensor decomposition, Machine learning, Classification, Exploratory data analysis |
url |
http://www.sciencedirect.com/science/article/pii/S2590134619300015 |
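
The description above summarizes the method as supervised learning on decomposed tensors of segmented raw chromatographic signal, used to rank chromatogram regions by their contribution to class separation. The following is a minimal illustrative sketch of that general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the function name `rank_segments`, the equal-width segmentation, the use of a truncated SVD of the sample-mode unfolding as a simple stand-in for the tensor decomposition, and the Fisher-style ratio as a stand-in for the supervised model. The SVD stand-in is also not tolerant of retention time shifts, which the article's approach is stated to avoid having to correct.

```python
import numpy as np


def rank_segments(data, labels, n_segments=4, n_components=2):
    """Rank retention-time segments by two-class separation (illustrative only).

    data   : array (n_samples, n_scans, n_mz) of raw GC-MS intensities
    labels : array (n_samples,) holding class labels 0 or 1
    Returns a list of (segment_index, (first_scan, end_scan), score),
    sorted from most to least discriminating.
    """
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    # Hypothetical choice: equal-width segments along the scan axis.
    bounds = np.linspace(0, data.shape[1], n_segments + 1).astype(int)

    scores = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        segment = data[:, start:stop, :]
        # Stand-in for a tensor decomposition: unfold along the sample
        # mode, mean-center, and keep the leading SVD sample scores.
        unfolded = segment.reshape(segment.shape[0], -1)
        unfolded = unfolded - unfolded.mean(axis=0)
        u, s, _ = np.linalg.svd(unfolded, full_matrices=False)
        sample_scores = u[:, :n_components] * s[:n_components]

        # Stand-in for the supervised step: Fisher-style ratio of
        # between-class to within-class variance per component.
        a = sample_scores[labels == 0]
        b = sample_scores[labels == 1]
        between = (a.mean(axis=0) - b.mean(axis=0)) ** 2
        within = a.var(axis=0) + b.var(axis=0) + 1e-12
        scores.append(float(np.max(between / within)))

    order = np.argsort(scores)[::-1]
    return [(int(i), (int(bounds[i]), int(bounds[i + 1])), scores[int(i)])
            for i in order]


# Synthetic demonstration data (not from the article): 12 samples,
# 40 scans, 5 m/z channels, with a class-specific signal in scans 20-29.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 40, 5))
y = np.array([0] * 6 + [1] * 6)
X[y == 1, 20:30, 2] += 5.0

ranking = rank_segments(X, y)
for index, scan_range, score in ranking:
    print(index, scan_range, round(score, 2))
```

In this synthetic setup the segment covering scans 20 to 29 should come out on top, since it is the only region where the two classes differ. A real analysis would need a decomposition suited to shifted peaks and a properly validated classifier.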