PredDBP-Stack: Prediction of DNA-Binding Proteins from HMM Profiles using a Stacked Ensemble Method

DNA-binding proteins (DBPs) play vital roles in all aspects of genetic activities. However, the identification of DBPs by using wet-lab experimental approaches is often time-consuming and laborious. In this study, we develop a novel computational method, called PredDBP-Stack, to predict DBPs solely...

Bibliographic Details
Main Authors:	Jun Wang, Huiwen Zheng, Yang Yang, Wanyue Xiao, Taigang Liu
Format:	Article
Language:	English
Published:	Hindawi Limited 2020-01-01
Series:	BioMed Research International
Online Access:	http://dx.doi.org/10.1155/2020/7297631

id	doaj-bbcc80e84b784c8a8d84f4fb233a1f1f
record_format	Article
spelling	doaj-bbcc80e84b784c8a8d84f4fb233a1f1f; 2020-11-25T02:41:30Z; eng; Hindawi Limited; BioMed Research International; 2314-6133; 2314-6141; 2020-01-01; 2020; 10.1155/2020/7297631; 7297631; PredDBP-Stack: Prediction of DNA-Binding Proteins from HMM Profiles using a Stacked Ensemble Method; Jun Wang (0), Huiwen Zheng (1), Yang Yang (2), Wanyue Xiao (3), Taigang Liu (4); (0) College of Information, Shanghai Ocean University, Shanghai 201306, China; (1) School of Engineering, University of Melbourne, Victoria 3010, Australia; (2) School of Information Management, Nanjing University, Nanjing 210023, China; (3) School of Information, Syracuse University, Syracuse, NY 13244, USA; (4) College of Information, Shanghai Ocean University, Shanghai 201306, China; DNA-binding proteins (DBPs) play vital roles in all aspects of genetic activities. However, the identification of DBPs by using wet-lab experimental approaches is often time-consuming and laborious. In this study, we develop a novel computational method, called PredDBP-Stack, to predict DBPs solely based on protein sequences. First, amino acid composition (AAC) and transition probability composition (TPC) extracted from the hidden Markov model (HMM) profile are adopted to represent a protein. Next, we establish a stacked ensemble model to identify DBPs, which involves two stages of learning. In the first stage, the four base classifiers are trained with the features of HMM-based compositions. In the second stage, the prediction probabilities of these base classifiers are used as inputs to the meta-classifier to perform the final prediction of DBPs. Based on the PDB1075 benchmark dataset, we conduct a jackknife cross-validation with the proposed PredDBP-Stack predictor and obtain a balanced sensitivity and specificity of 92.47% and 92.36%, respectively. This outcome outperforms most of the existing classifiers. Furthermore, our method also achieves superior performance and model robustness on the PDB186 independent dataset. This demonstrates that the PredDBP-Stack is an effective classifier for accurately identifying DBPs based on protein sequence information alone.; http://dx.doi.org/10.1155/2020/7297631
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Jun Wang Huiwen Zheng Yang Yang Wanyue Xiao Taigang Liu
spellingShingle	Jun Wang Huiwen Zheng Yang Yang Wanyue Xiao Taigang Liu PredDBP-Stack: Prediction of DNA-Binding Proteins from HMM Profiles using a Stacked Ensemble Method BioMed Research International
author_facet	Jun Wang Huiwen Zheng Yang Yang Wanyue Xiao Taigang Liu
author_sort	Jun Wang
title	PredDBP-Stack: Prediction of DNA-Binding Proteins from HMM Profiles using a Stacked Ensemble Method
title_short	PredDBP-Stack: Prediction of DNA-Binding Proteins from HMM Profiles using a Stacked Ensemble Method
title_full	PredDBP-Stack: Prediction of DNA-Binding Proteins from HMM Profiles using a Stacked Ensemble Method
title_fullStr	PredDBP-Stack: Prediction of DNA-Binding Proteins from HMM Profiles using a Stacked Ensemble Method
title_full_unstemmed	PredDBP-Stack: Prediction of DNA-Binding Proteins from HMM Profiles using a Stacked Ensemble Method
title_sort	preddbp-stack: prediction of dna-binding proteins from hmm profiles using a stacked ensemble method
publisher	Hindawi Limited
series	BioMed Research International
issn	2314-6133 2314-6141
publishDate	2020-01-01
description	DNA-binding proteins (DBPs) play vital roles in all aspects of genetic activities. However, the identification of DBPs by using wet-lab experimental approaches is often time-consuming and laborious. In this study, we develop a novel computational method, called PredDBP-Stack, to predict DBPs solely based on protein sequences. First, amino acid composition (AAC) and transition probability composition (TPC) extracted from the hidden Markov model (HMM) profile are adopted to represent a protein. Next, we establish a stacked ensemble model to identify DBPs, which involves two stages of learning. In the first stage, the four base classifiers are trained with the features of HMM-based compositions. In the second stage, the prediction probabilities of these base classifiers are used as inputs to the meta-classifier to perform the final prediction of DBPs. Based on the PDB1075 benchmark dataset, we conduct a jackknife cross-validation with the proposed PredDBP-Stack predictor and obtain a balanced sensitivity and specificity of 92.47% and 92.36%, respectively. This outcome outperforms most of the existing classifiers. Furthermore, our method also achieves superior performance and model robustness on the PDB186 independent dataset. This demonstrates that the PredDBP-Stack is an effective classifier for accurately identifying DBPs based on protein sequence information alone.
url	http://dx.doi.org/10.1155/2020/7297631
work_keys_str_mv	AT junwang preddbpstackpredictionofdnabindingproteinsfromhmmprofilesusingastackedensemblemethod AT huiwenzheng preddbpstackpredictionofdnabindingproteinsfromhmmprofilesusingastackedensemblemethod AT yangyang preddbpstackpredictionofdnabindingproteinsfromhmmprofilesusingastackedensemblemethod AT wanyuexiao preddbpstackpredictionofdnabindingproteinsfromhmmprofilesusingastackedensemblemethod AT taigangliu preddbpstackpredictionofdnabindingproteinsfromhmmprofilesusingastackedensemblemethod
_version_	1715414500041031680
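
As an illustration of the feature step described in the abstract, the following Python sketch computes AAC and TPC from a profile given as an L x 20 matrix of per-position values. It is not taken from the article: the TPC normalization used here (products of adjacent positions, normalized per source column) is an assumption based on common profile-based definitions, and the function names are hypothetical. Parsing a profile file and the stacked classifiers themselves are omitted.

```python
from typing import List

Profile = List[List[float]]  # L rows (sequence positions) x N columns (amino acid types)


def aac(profile: Profile) -> List[float]:
    """Amino acid composition: the mean of each profile column."""
    length = len(profile)
    n_cols = len(profile[0])
    return [sum(row[j] for row in profile) / length for j in range(n_cols)]


def tpc(profile: Profile) -> List[float]:
    """Transition probability composition (assumed definition).

    raw[i][j] sums p[k][i] * p[k+1][j] over adjacent positions k;
    each row i is then normalized so that its entries sum to 1.
    """
    n_cols = len(profile[0])
    feats: List[float] = []
    for i in range(n_cols):
        raw_row = [
            sum(profile[k][i] * profile[k + 1][j] for k in range(len(profile) - 1))
            for j in range(n_cols)
        ]
        total = sum(raw_row)
        # Guard against an all-zero row to avoid division by zero.
        feats.extend(v / total if total else 0.0 for v in raw_row)
    return feats


def hmm_features(profile: Profile) -> List[float]:
    """Concatenate AAC and TPC into one feature vector."""
    return aac(profile) + tpc(profile)
```

With a 20-column profile, this sketch yields 20 + 400 = 420 values per protein, which could then be passed to the base classifiers of a stacked ensemble.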

Similar Items