Identification of ATP Binding Protein Using Support Vector Machine and Genetic Algorithm Based on n-peptide Composition

Bibliographic Details
Main Authors: Mao-Lun Wei, 韋懋綸
Other Authors: 游景盛
Format: Others
Language: zh-TW
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/94849439189706536316
Description
Summary: Master's thesis === Feng Chia University === Department of Information Engineering and Computer Science === Academic Year 102 === ATP binding proteins (ABPs) have binding sites that interact with ATP. ATP interacts with these binding sites and releases chemical energy that powers the proteins' mechanical work. Most ATP binding proteins are transmembrane proteins responsible for transporting substrates across extracellular and intracellular membranes. ATP binding proteins are also involved in muscle contraction and in the regulation of metabolic processes. We calculate several n-peptide compositions from protein sequences and use a genetic algorithm to reduce the dimension of each n-peptide composition. We then use a support vector machine with five-fold cross-validation to obtain the prediction results. We find that the genetic algorithm both reduces the dimension of the feature vector and improves prediction performance. For example, the dimension of Dj3 is reduced from 441 to 179, and the Matthews correlation coefficient (MCC) increases from 0.36 to 0.51. We also propose a voting strategy and an integration strategy to improve prediction performance further. The voting strategy achieves an MCC of 0.68 and the integration strategy achieves an MCC of 0.56, so both strategies improve on the MCC of 0.51 reported above for Dj3 after feature selection. We also investigate whether ATP binding proteins of different functional or structural classes differ in prediction performance. To do so, we classify the dataset by SCOP fold information and by Gene Ontology terms. We find that prediction performance is low for some classes, which suggests that a dedicated prediction tool is needed for those classes.
Keywords: Support Vector Machine, ATP Binding Protein (ABP), n-peptide Composition, Genetic Algorithm
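
The two quantities the summary relies on, an n-peptide composition vector and the Matthews correlation coefficient, can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The 20-letter alphabet, the choice to skip windows containing non-standard residues, and the function names n_peptide_composition and mcc are assumptions made for illustration. The record does not define the Dj3 feature set, so the sketch does not try to reproduce it.

```python
from itertools import product
from math import sqrt

# Assumption: the 20 standard amino acids; the thesis may use a different alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def n_peptide_composition(sequence, n=2, alphabet=AMINO_ACIDS):
    """Return the relative frequency of every length-n peptide, in a fixed order."""
    keys = ["".join(p) for p in product(alphabet, repeat=n)]
    counts = dict.fromkeys(keys, 0)
    windows = 0
    for i in range(len(sequence) - n + 1):
        peptide = sequence[i:i + n]
        if peptide in counts:  # skip windows that contain non-standard residues
            counts[peptide] += 1
            windows += 1
    if windows == 0:
        return [0.0] * len(keys)
    return [counts[k] / windows for k in keys]


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from the four confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Usage: a dipeptide (n=2) composition has 20 * 20 = 400 dimensions.
vector = n_peptide_composition("MKTAYIAKQR", n=2)
print(len(vector))
```

Vectors of this kind would serve as the classifier's input, and a genetic algorithm would then search for a subset of their dimensions that raises the cross-validated MCC.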