PClass: Protein Quaternary Structure Classification by Using Bootstrapping Strategy as Model Selection
A protein quaternary structure complex, also known as a multimer, plays an important role in the cell. For example, the dimer structure of transcription factors is involved in gene regulation, while the trimer structure of virus-infection-associated glycoproteins is related to the human immunodeficiency virus...
Main Authors: | Chi-Chou Huang, Chi-Chang Chang, Chi-Wei Chen, Shao-yu Ho, Hsung-Pin Chang, Yen-Wei Chu |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2018-02-01 |
Series: | Genes |
Subjects: | protein quaternary structure; bootstrap strategy; model selection; classification |
Online Access: | http://www.mdpi.com/2073-4425/9/2/91 |
id |
doaj-a7e17443c716481992f6b8604cbb409b |
---|---|
record_format |
Article |
spelling |
doaj-a7e17443c716481992f6b8604cbb409b; 2020-11-24T23:41:36Z; eng; MDPI AG; Genes; 2073-4425; 2018-02-01; 9; 2; 91; 10.3390/genes9020091; genes9020091; PClass: Protein Quaternary Structure Classification by Using Bootstrapping Strategy as Model Selection
Chi-Chou Huang (0); Chi-Chang Chang (1); Chi-Wei Chen (2); Shao-yu Ho (3); Hsung-Pin Chang (4); Yen-Wei Chu (5)
School of Medicine, Chung Shan Medical University, Taichung 40201, Taiwan; School of Medical Informatics, Chung-Shan Medical University, Taichung 40201, Taiwan; Institute of Genomics and Bioinformatics, National Chung Hsing University, Kuo Kuang Rd., Taichung 402, Taiwan; Institute of Genomics and Bioinformatics, National Chung Hsing University, Kuo Kuang Rd., Taichung 402, Taiwan; Department of Computer Science and Engineering, National Chung-Hsing University, Kuo Kuang Rd., Taichung 402, Taiwan; Institute of Genomics and Bioinformatics, National Chung Hsing University, Kuo Kuang Rd., Taichung 402, Taiwan
http://www.mdpi.com/2073-4425/9/2/91; protein quaternary structure; bootstrap strategy; model selection; classification |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Chi-Chou Huang; Chi-Chang Chang; Chi-Wei Chen; Shao-yu Ho; Hsung-Pin Chang; Yen-Wei Chu |
author_sort |
Chi-Chou Huang |
title |
PClass: Protein Quaternary Structure Classification by Using Bootstrapping Strategy as Model Selection |
publisher |
MDPI AG |
series |
Genes |
issn |
2073-4425 |
publishDate |
2018-02-01 |
description |
A protein quaternary structure complex, also known as a multimer, plays an important role in the cell. For example, the dimer structure of transcription factors is involved in gene regulation, while the trimer structure of virus-infection-associated glycoproteins is related to the human immunodeficiency virus. Classifying protein quaternary structure complexes will be of great help to proteomics research in the post-genome era, yet classification systems for protein quaternary structures have not been widely developed. Therefore, in this study we designed a two-layer machine learning architecture and developed the classification system PClass. Protein quaternary structure complexes are divided into five categories: monomer, dimer, trimer, tetramer, and other subunit classes. Within the framework of the bootstrap method with a support vector machine, we propose a new model selection method. Each type of complex is classified based on sequences, entropy, and accessible surface area, thereby generating multiple feature modules. Subsequently, the best-performing model is selected as the feature module for each type of complex. At this stage, the best performance reaches a Matthews correlation coefficient (MCC) of up to 70%. The second layer combines the first-layer modules through integration mechanisms and uses six machine learning methods to improve prediction performance, raising MCC by more than 10%. Finally, we analyzed the performance of our classification system using transcription factors in dimer structure and virus-infection-associated glycoproteins in trimer structure. PClass is available via a web interface at http://predictor.nchu.edu.tw/PClass/. |
topic |
protein quaternary structure; bootstrap strategy; model selection; classification |
url |
http://www.mdpi.com/2073-4425/9/2/91 |
_version_ |
1725506425399017472 |
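
The description above reports performance as a Matthews correlation coefficient (MCC). As a reading aid, the following is a minimal sketch of the standard binary MCC formula applied one class at a time. It is not taken from the PClass implementation: the function names `mcc` and `one_vs_rest_counts`, the one-vs-rest treatment of the five classes, and the toy labels are all assumptions made for illustration.

```python
import math


def mcc(tp, tn, fp, fn):
    """Standard binary Matthews correlation coefficient from confusion counts.

    Returns a value in [-1, 1]; multiply by 100 for a percentage.
    By convention, returns 0.0 when the denominator is zero.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


def one_vs_rest_counts(y_true, y_pred, positive):
    """Confusion counts (tp, tn, fp, fn) treating `positive` as the target class.

    Hypothetical helper: assumes each class (e.g. "dimer") is scored
    against all other classes combined.
    """
    tp = tn = fp = fn = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == positive and guess == positive:
            tp += 1
        elif truth != positive and guess != positive:
            tn += 1
        elif truth != positive and guess == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


if __name__ == "__main__":
    # Toy labels, invented for illustration only.
    y_true = ["dimer", "dimer", "monomer", "trimer", "dimer", "tetramer"]
    y_pred = ["dimer", "monomer", "monomer", "dimer", "dimer", "tetramer"]
    counts = one_vs_rest_counts(y_true, y_pred, "dimer")
    print("dimer-vs-rest MCC:", mcc(*counts))
```

Unlike plain accuracy, MCC uses all four confusion counts, so it remains informative when one class (for example, monomers) is much more common than the others.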