PhosphoPredict: A bioinformatics tool for prediction of human kinase-specific phosphorylation substrates and sites by integrating heterogeneous feature selection

Abstract Protein phosphorylation is a major form of post-translational modification (PTM) that regulates diverse cellular processes. In silico methods for phosphorylation site prediction can provide a useful and complementary strategy for complete phosphoproteome annotation. Here, we present a novel...

Bibliographic Details
Main Authors: Jiangning Song, Huilin Wang, Jiawei Wang, André Leier, Tatiana Marquez-Lago, Bingjiao Yang, Ziding Zhang, Tatsuya Akutsu, Geoffrey I. Webb, Roger J. Daly
Format: Article
Language: English
Published: Nature Publishing Group 2017-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-017-07199-4
id doaj-f266672a200d45e199456ae3087541bc
record_format Article
Record updated: 2020-12-08T00:46:10Z
Citation: Scientific Reports (ISSN 2045-2322), volume 7, issue 1, pages 1-19, Nature Publishing Group, 2017-07-01. https://doi.org/10.1038/s41598-017-07199-4
Author affiliations:
Jiangning Song: Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University
Huilin Wang: Department of Chemical Biology, College of Chemistry and Chemical Engineering, Xiamen University
Jiawei Wang: Biomedicine Discovery Institute and Department of Microbiology, Monash University
André Leier: Informatics Institute and Department of Genetics, School of Medicine, University of Alabama at Birmingham
Tatiana Marquez-Lago: Informatics Institute and Department of Genetics, School of Medicine, University of Alabama at Birmingham
Bingjiao Yang: College of Mechanical Engineering, Yanshan University
Ziding Zhang: State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University
Tatsuya Akutsu: Bioinformatics Center, Institute for Chemical Research, Kyoto University
Geoffrey I. Webb: Monash Centre for Data Science, Faculty of Information Technology, Monash University
Roger J. Daly: Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University
collection DOAJ
sources DOAJ
issn 2045-2322
description Abstract Protein phosphorylation is a major form of post-translational modification (PTM) that regulates diverse cellular processes. In silico methods for phosphorylation site prediction can provide a useful and complementary strategy for complete phosphoproteome annotation. Here, we present a novel bioinformatics tool, PhosphoPredict, that combines protein sequence and functional features to predict kinase-specific substrates and their associated phosphorylation sites for 12 human kinases and kinase families, including ATM, CDKs, GSK-3, MAPKs, PKA, PKB, PKC, and SRC. To elucidate critical determinants, we identified feature subsets that were most informative and relevant for predicting substrate specificity for each individual kinase family. Extensive benchmarking experiments based on both five-fold cross-validation and independent tests indicated that the performance of PhosphoPredict is competitive with that of several other popular prediction tools, including KinasePhos, PPSP, GPS, and Musite. We found that combining protein functional and sequence features significantly improves phosphorylation site prediction performance across all kinases. Application of PhosphoPredict to the entire human proteome identified 150 to 800 potential phosphorylation substrates for each of the 12 kinases or kinase families. PhosphoPredict significantly extends the bioinformatics portfolio for kinase function analysis and will facilitate high-throughput identification of kinase-specific phosphorylation sites, thereby contributing to both basic and translational research programs.