Deriving Lipid Classification Based on Molecular Formulas
Despite instrument and algorithmic improvements, the untargeted and accurate assignment of metabolites remains an unsolved problem in metabolomics. New assignment methods such as our SMIRFE algorithm can assign elemental molecular formulas to observed spectral features in a highly untargeted manner...
Main Authors: | Joshua M. Mitchell, Robert M. Flight, Hunter N.B. Moseley |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-03-01 |
Series: | Metabolites |
Subjects: | SMIRFE, lipidomics, metabolomics, lipid category, machine learning, random forest |
Online Access: | https://www.mdpi.com/2218-1989/10/3/122 |
id |
doaj-a1a512dee1d04d3caa4ebd5b051cf4ab |
---|---|
record_format |
Article |
spelling |
doaj-a1a512dee1d04d3caa4ebd5b051cf4ab; 2020-11-25T01:37:46Z; eng; MDPI AG; Metabolites; ISSN 2218-1989; 2020-03-01; vol. 10, no. 3, art. 122; DOI 10.3390/metabo10030122; metabo10030122; Deriving Lipid Classification Based on Molecular Formulas; Joshua M. Mitchell, Robert M. Flight, Hunter N.B. Moseley (all: Department of Molecular & Cellular Biochemistry, University of Kentucky, Lexington, KY 40536, USA); https://www.mdpi.com/2218-1989/10/3/122; smirfe, lipidomics, metabolomics, lipid category, machine learning, random forest |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Joshua M. Mitchell, Robert M. Flight, Hunter N.B. Moseley |
title |
Deriving Lipid Classification Based on Molecular Formulas |
publisher |
MDPI AG |
series |
Metabolites |
issn |
2218-1989 |
publishDate |
2020-03-01 |
description |
Despite instrument and algorithmic improvements, the untargeted and accurate assignment of metabolites remains an unsolved problem in metabolomics. New assignment methods such as our SMIRFE algorithm can assign elemental molecular formulas to observed spectral features in a highly untargeted manner without orthogonal information from tandem MS or chromatography. However, for many lipidomics applications, it is necessary to know at least the lipid category or class that is associated with a detected spectral feature to derive a biochemical interpretation. Our goal is to develop a method for robustly classifying elemental molecular formula assignments into lipid categories for an application to SMIRFE-generated assignments. Using a Random Forest machine learning approach, we developed a method that can predict lipid category and class from SMIRFE non-adducted molecular formula assignments. Our methods achieve high average predictive accuracy (>90%) and precision (>83%) across all eight of the lipid categories in the LIPIDMAPS database. Classification performance was evaluated using sets of theoretical, data-derived, and artifactual molecular formulas. Our methods enable the lipid classification of non-adducted molecular formula assignments generated by SMIRFE without orthogonal information, facilitating the biochemical interpretation of untargeted lipidomics experiments. This lipid classification appears insufficient for validating single-spectrum assignments, but could be useful in cross-spectrum assignment validation. |
topic |
SMIRFE, lipidomics, metabolomics, lipid category, machine learning, random forest |
url |
https://www.mdpi.com/2218-1989/10/3/122 |