PSSP-RFE: accurate prediction of protein structural class by recursive feature extraction from PSI-BLAST profile, physical-chemical property and functional annotations.

Protein structure prediction is critical to functional annotation of the massively accumulated biological sequences, which prompts an imperative need for the development of high-throughput technologies. As a first and key step in protein structure prediction, protein structural class prediction beco...


Bibliographic Details
Main Authors: Liqi Li, Xiang Cui, Sanjiu Yu, Yuan Zhang, Zhong Luo, Hua Yang, Yue Zhou, Xiaoqi Zheng
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2014-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3968047?pdf=render
id doaj-0dedf7fbb1034a518458271cacd32e16
record_format Article
spelling doaj-0dedf7fbb1034a518458271cacd32e16 2020-11-25T01:46:27Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2014-01-01 9(3): e92863 doi:10.1371/journal.pone.0092863 http://europepmc.org/articles/PMC3968047?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2014-01-01
description Protein structure prediction is critical to the functional annotation of the rapidly accumulating biological sequences, which creates a pressing need for high-throughput technologies. As a first and key step in protein structure prediction, protein structural class prediction has become an increasingly challenging task. For most homology-based approaches, the accuracy of protein structural class prediction is sufficiently high on high-similarity datasets but still far from satisfactory on low-similarity datasets, i.e., those with pairwise sequence similarity below 40%. We therefore present a novel method for accurate and reliable protein structural class prediction on both high- and low-similarity datasets. The method is based on a Support Vector Machine (SVM) combined with integrated features from the position-specific scoring matrix (PSSM), PROFEAT and Gene Ontology (GO). A feature selection approach, SVM-RFE, is used to rank the integrated features by recursively removing the feature with the lowest ranking score. The top-ranked features selected by SVM-RFE are then input into the SVM to predict the structural class of a query protein. To validate the method, jackknife tests were applied to seven widely used benchmark datasets, reaching overall accuracies between 84.61% and 99.79%, which are significantly higher than those achieved by state-of-the-art tools. These results suggest that the method could serve as an accurate and cost-effective alternative to existing methods for protein structural classification, especially on low-similarity datasets.
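The ranking-and-validation loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a two-class problem (the paper predicts several structural classes), a linear SVM without a bias term trained by plain subgradient descent, and a made-up toy dataset in place of the PSSM, PROFEAT and GO features. The function names `train_linear_svm`, `svm_rfe` and `jackknife_accuracy` are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=200):
    """Batch subgradient descent on 0.5*||w||^2 + C*sum(hinge loss).

    Assumption: no bias term, so the data should be roughly centered.
    """
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = list(w)  # gradient of the L2 penalty term
        for xi, yi in zip(X, y):
            if yi * dot(w, xi) < 1:  # margin violated: hinge term is active
                grad = [g - C * yi * xj for g, xj in zip(grad, xi)]
        w = [wj - lr * g for wj, g in zip(w, grad)]
    return w


def svm_rfe(X, y):
    """Rank feature indices best-first by recursive feature elimination."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > 1:
        X_sub = [[row[j] for j in remaining] for row in X]
        w = train_linear_svm(X_sub, y)
        # The ranking score of a feature is its squared SVM weight;
        # the feature with the lowest score is removed in each round.
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return remaining + eliminated[::-1]


def jackknife_accuracy(X, y, n_keep):
    """Leave-one-out test; features are re-selected inside every fold."""
    hits = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        keep = svm_rfe(X_train, y_train)[:n_keep]
        w = train_linear_svm([[row[j] for j in keep] for row in X_train], y_train)
        score = dot(w, [X[i][j] for j in keep])
        hits += (1 if score >= 0 else -1) == y[i]
    return hits / len(X)


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is
# weak noise, feature 2 is constant and carries no information.
X = [[1.0, 0.3, 0.0], [1.5, -0.4, 0.0], [2.0, 0.2, 0.0], [1.2, -0.1, 0.0],
     [-1.0, 0.4, 0.0], [-1.4, -0.3, 0.0], [-2.0, 0.1, 0.0], [-1.1, -0.2, 0.0]]
y = [1, 1, 1, 1, -1, -1, -1, -1]

ranking = svm_rfe(X, y)
accuracy = jackknife_accuracy(X, y, n_keep=2)
print("ranking (best first):", ranking)
print("jackknife accuracy:", accuracy)
```

On this toy data the informative feature 0 is ranked first. Re-running the selection inside each jackknife fold keeps the held-out sample from influencing which features are kept, which is a design choice of this sketch rather than something the abstract specifies.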
url http://europepmc.org/articles/PMC3968047?pdf=render