MSpectraAI: a powerful platform for deciphering proteome profiling of multi-tumor mass spectrometry data by using deep neural networks

Abstract. Background: Mass spectrometry (MS) has become a promising analytical technique to acquire proteomics information for the characterization of biological samples. Nevertheless, most studies focus on the final proteins identified through a suite of algorithms by using partial MS spectra to comp...

Bibliographic Details
Main Authors: Shisheng Wang, Hongwen Zhu, Hu Zhou, Jingqiu Cheng, Hao Yang
Format: Article
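Language: English
Published: BMC 2020-10-01
Series: BMC Bioinformatics
Subjects: Raw mass spectrometry data; Proteome profiling; Feature swath extraction; Deep neural networks; Multi-tumor types; Leave-one-out cross prediction strategy
Online Access: http://link.springer.com/article/10.1186/s12859-020-03783-0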
id doaj-c7539a0edff947c288ef1112eb231b0e
record_format Article
spelling doaj-c7539a0edff947c288ef1112eb231b0e; 2020-11-25T02:45:44Z; eng; BMC; BMC Bioinformatics; ISSN 1471-2105; 2020-10-01; vol. 21, no. 1, pp. 1-15; doi:10.1186/s12859-020-03783-0
MSpectraAI: a powerful platform for deciphering proteome profiling of multi-tumor mass spectrometry data by using deep neural networks
Shisheng Wang (West China-Washington Mitochondria and Metabolism Research Center; Key Lab of Transplant Engineering and Immunology, MOH, Regenerative Medicine Research Center, West China Hospital, Sichuan University)
Hongwen Zhu (Shanghai Institute of Materia Medica, Chinese Academy of Sciences)
Hu Zhou (Shanghai Institute of Materia Medica, Chinese Academy of Sciences)
Jingqiu Cheng (West China-Washington Mitochondria and Metabolism Research Center; Key Lab of Transplant Engineering and Immunology, MOH, Regenerative Medicine Research Center, West China Hospital, Sichuan University)
Hao Yang (West China-Washington Mitochondria and Metabolism Research Center; Key Lab of Transplant Engineering and Immunology, MOH, Regenerative Medicine Research Center, West China Hospital, Sichuan University)
collection DOAJ
language English
format Article
sources DOAJ
title MSpectraAI: a powerful platform for deciphering proteome profiling of multi-tumor mass spectrometry data by using deep neural networks
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2020-10-01
description Background: Mass spectrometry (MS) has become a promising analytical technique for acquiring proteomics information to characterize biological samples. Nevertheless, most studies focus on the final proteins identified by a suite of algorithms that compare partial MS spectra against a sequence database, while pattern recognition and classification of raw mass-spectrometric data remain unresolved. Results: We developed an open-source, comprehensive platform, named MSpectraAI, for analyzing large-scale MS data through deep neural networks (DNNs); the system covers spectral-feature swath extraction, classification, and visualization. The platform also allows users to create their own DNN models with Keras. To evaluate the tool, we collected publicly available proteomics datasets of six tumor types (a total of 7,997,805 mass spectra) from the ProteomeXchange consortium and classified the samples based on their spectral profiles. The results suggest that MSpectraAI can distinguish different sample types based on the fingerprint spectrum and achieves better prediction accuracy at the MS1 level (average 0.967). Conclusion: This study deciphers proteome profiling of raw mass spectrometry data and broadens the application of deep learning methods to the classification and prediction of proteomics data from multi-tumor samples. MSpectraAI also performs better than other classical machine learning approaches.
topic Raw mass spectrometry data
Proteome profiling
Feature swath extraction
Deep neural networks
Multi-tumor types
Leave-one-out cross prediction strategy
url http://link.springer.com/article/10.1186/s12859-020-03783-0