Deriving Lipid Classification Based on Molecular Formulas

Despite instrument and algorithmic improvements, the untargeted and accurate assignment of metabolites remains an unsolved problem in metabolomics. New assignment methods such as our SMIRFE algorithm can assign elemental molecular formulas to observed spectral features in a highly untargeted manner...


Bibliographic Details
Main Authors:	Joshua M. Mitchell, Robert M. Flight, Hunter N.B. Moseley
Format:	Article
Language:	English
Published:	MDPI AG 2020-03-01
Series:	Metabolites
Subjects:	SMIRFE; lipidomics; metabolomics; lipid category; machine learning; random forest
Online Access:	https://www.mdpi.com/2218-1989/10/3/122

id	doaj-a1a512dee1d04d3caa4ebd5b051cf4ab
record_format	Article
spelling	doaj-a1a512dee1d04d3caa4ebd5b051cf4ab; 2020-11-25T01:37:46Z; eng; MDPI AG; Metabolites; ISSN 2218-1989; 2020-03-01; volume 10, issue 3, article 122; DOI 10.3390/metabo10030122; Deriving Lipid Classification Based on Molecular Formulas; Joshua M. Mitchell, Robert M. Flight, Hunter N.B. Moseley (all three: Department of Molecular & Cellular Biochemistry, University of Kentucky, Lexington, KY 40536, USA)
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Joshua M. Mitchell Robert M. Flight Hunter N.B. Moseley
title	Deriving Lipid Classification Based on Molecular Formulas
publisher	MDPI AG
series	Metabolites
issn	2218-1989
publishDate	2020-03-01
description	Despite instrument and algorithmic improvements, the untargeted and accurate assignment of metabolites remains an unsolved problem in metabolomics. New assignment methods such as our SMIRFE algorithm can assign elemental molecular formulas to observed spectral features in a highly untargeted manner without orthogonal information from tandem MS or chromatography. However, for many lipidomics applications, it is necessary to know at least the lipid category or class that is associated with a detected spectral feature to derive a biochemical interpretation. Our goal is to develop a method for robustly classifying elemental molecular formula assignments into lipid categories for an application to SMIRFE-generated assignments. Using a Random Forest machine learning approach, we developed a method that can predict lipid category and class from SMIRFE non-adducted molecular formula assignments. Our methods achieve high average predictive accuracy (>90%) and precision (>83%) across all eight of the lipid categories in the LIPIDMAPS database. Classification performance was evaluated using sets of theoretical, data-derived, and artifactual molecular formulas. Our methods enable the lipid classification of non-adducted molecular formula assignments generated by SMIRFE without orthogonal information, facilitating the biochemical interpretation of untargeted lipidomics experiments. This lipid classification appears insufficient for validating single-spectrum assignments, but could be useful in cross-spectrum assignment validation.
topic	SMIRFE; lipidomics; metabolomics; lipid category; machine learning; random forest
url	https://www.mdpi.com/2218-1989/10/3/122
_version_	1725057524223180800
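
The description in this record says the method classifies non-adducted elemental molecular formulas with a Random Forest, but the record does not say which features are derived from a formula. As a minimal sketch of one plausible preprocessing step, the Python below turns a flat formula string into a fixed-length vector of element counts. The element list, the function names, and the choice of element counts as features are all assumptions made for illustration; they are not taken from the article.

```python
import re
from collections import Counter

# Hypothetical fixed element order for the feature vector. The record does not
# state which elements or features the authors actually used.
ELEMENTS = ("C", "H", "N", "O", "P", "S")

# One element symbol (capital letter plus optional lowercase letter)
# followed by an optional integer count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def element_counts(formula: str) -> Counter:
    """Count atoms in a flat formula such as 'C16H32O2'.

    Brackets, charges, and isotope labels are not supported in this sketch.
    """
    counts = Counter()
    pos = 0
    for match in _TOKEN.finditer(formula):
        # Reject any characters skipped between consecutive tokens.
        if match.start() != pos:
            raise ValueError(f"unexpected text in formula: {formula!r}")
        symbol, number = match.groups()
        counts[symbol] += int(number) if number else 1
        pos = match.end()
    # Reject empty input and trailing characters that matched no token.
    if not formula or pos != len(formula):
        raise ValueError(f"could not parse formula: {formula!r}")
    return counts


def feature_vector(formula: str, elements=ELEMENTS) -> list:
    """Return element counts in a fixed order; elements not listed are ignored."""
    counts = element_counts(formula)
    return [counts.get(element, 0) for element in elements]


if __name__ == "__main__":
    # Palmitic acid, C16H32O2
    print(feature_vector("C16H32O2"))  # [16, 32, 0, 2, 0, 0]
```

Vectors of this shape, paired with known lipid category labels, could then be passed to any standard Random Forest implementation (for example scikit-learn's `RandomForestClassifier`). Whether that matches the authors' pipeline is not something this record confirms.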

