Identification of ATP Binding Protein Using Support Vector Machine and Genetic Algorithm Based on n-peptide Composition
Master's === Feng Chia University === Department of Information Engineering and Computer Science === 102 === ATP binding proteins (ABPs) have binding sites that interact with ATP. ATP binds at these sites and releases chemical energy that powers the proteins' mechanical work. Most ATP binding proteins are transmembrane proteins an...
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Others |
Language: | zh-TW |
Published: |
2014
|
Online Access: | http://ndltd.ncl.edu.tw/handle/94849439189706536316 |
id |
ndltd-TW-102FCU05392032 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-102FCU05392032 2015-10-13T23:49:49Z http://ndltd.ncl.edu.tw/handle/94849439189706536316 Identification of ATP Binding Protein Using Support Vector Machine and Genetic Algorithm Based on n-peptide Composition 利用支持向量機及遺傳演算法基於n-peptide組成分辨識ATP結合蛋白質 Mao-Lun Wei 韋懋綸 Master's Feng Chia University Department of Information Engineering and Computer Science 102 ATP binding proteins (ABPs) have binding sites that interact with ATP. ATP binds at these sites and releases chemical energy that powers the proteins' mechanical work. Most ATP binding proteins are transmembrane proteins responsible for transporting substrates across extracellular and intracellular membranes. ATP binding proteins are also involved in muscle contraction and the regulation of metabolic processes. We calculate several n-peptide compositions from protein sequences and use a genetic algorithm to reduce their dimension. We use a support vector machine with five-fold cross-validation to obtain the prediction results. We find that the genetic algorithm reduces the dimension of the feature vector and improves prediction performance. For example, the dimension of Dj3 is reduced from 441 to 179 and the Matthews correlation coefficient (MCC) increases from 0.36 to 0.51. We also propose a voting strategy and an integration strategy to improve prediction performance. The voting strategy achieves an MCC of 0.68 and the integration strategy an MCC of 0.56, so both improve on the single-feature result. We further examine whether ATP binding proteins of different functional or structural classes differ in prediction performance. We partition the dataset by SCOP fold information and Gene Ontology terms and find that prediction performance is low for some classes. It seems necessary to develop a specific prediction tool for the classes with low prediction performance.
Keywords: Support Vector Machine, ATP Binding Protein (ABP), n-peptide Composition, Genetic Algorithm 游景盛 2014 thesis 53 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others
|
sources |
NDLTD |
description |
Master's === Feng Chia University === Department of Information Engineering and Computer Science === 102 === ATP binding proteins (ABPs) have binding sites that interact with ATP. ATP binds at these sites and releases chemical energy that powers the proteins' mechanical work. Most ATP binding proteins are transmembrane proteins responsible for transporting substrates across extracellular and intracellular membranes. ATP binding proteins are also involved in muscle contraction and the regulation of metabolic processes. We calculate several n-peptide compositions from protein sequences and use a genetic algorithm to reduce their dimension. We use a support vector machine with five-fold cross-validation to obtain the prediction results. We find that the genetic algorithm reduces the dimension of the feature vector and improves prediction performance. For example, the dimension of Dj3 is reduced from 441 to 179 and the Matthews correlation coefficient (MCC) increases from 0.36 to 0.51. We also propose a voting strategy and an integration strategy to improve prediction performance. The voting strategy achieves an MCC of 0.68 and the integration strategy an MCC of 0.56, so both improve on the single-feature result. We further examine whether ATP binding proteins of different functional or structural classes differ in prediction performance. We partition the dataset by SCOP fold information and Gene Ontology terms and find that prediction performance is low for some classes. It seems necessary to develop a specific prediction tool for the classes with low prediction performance.
Keywords: Support Vector Machine, ATP Binding Protein (ABP), n-peptide Composition, Genetic Algorithm
|
author2 |
游景盛 |
author_facet |
游景盛 Mao-Lun Wei 韋懋綸 |
author |
Mao-Lun Wei 韋懋綸 |
author_sort |
Mao-Lun Wei |
title |
Identification of ATP Binding Protein Using Support Vector Machine and Genetic Algorithm Based on n-peptide Composition |
publishDate |
2014 |
url |
http://ndltd.ncl.edu.tw/handle/94849439189706536316 |
_version_ |
1718087105556512768 |
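
The abstract in this record names three computational steps: n-peptide composition features, dimension reduction by a genetic algorithm, and evaluation by the Matthews correlation coefficient. The following is a minimal Python sketch of the feature and metric steps, not the thesis's implementation. The function names, the 20-letter alphabet, and the binary-mask representation of a genetic-algorithm individual are all assumptions made for illustration. The record does not define the Dj3 feature set; because 441 = 21 × 21, it presumably uses a different encoding from the plain 400-dimensional dipeptide composition shown here.

```python
from itertools import product
from math import sqrt

# Assumption: the 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def n_peptide_composition(sequence, n=2):
    """Return the relative frequency of every length-n peptide, in a fixed order."""
    keys = ["".join(p) for p in product(AMINO_ACIDS, repeat=n)]
    counts = dict.fromkeys(keys, 0)
    windows = 0
    for i in range(len(sequence) - n + 1):
        peptide = sequence[i:i + n]
        if peptide in counts:  # windows with non-standard letters are skipped
            counts[peptide] += 1
            windows += 1
    return [counts[k] / windows if windows else 0.0 for k in keys]


def select_features(features, mask):
    """Keep the features whose mask bit is set (hypothetical GA chromosome)."""
    return [x for x, keep in zip(features, mask) if keep]


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

In a setup like the one the abstract describes, a genetic algorithm would search over masks passed to `select_features`, scoring each mask by the cross-validated MCC of a classifier trained on the reduced vectors; that search loop and the classifier are left out of this sketch.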