Protein Tyrosine Sulfation Sites Prediction: Based on Support Vector Machine and Pairwise Position Weighted Matrix of Amino Acid Sequence

Master's === National Chiao Tung University === Institute of Biomedical Engineering === 101 === Protein tyrosine sulfation is a common post-translational modification. Identifying tyrosine sulfation sites is important for biologists to predict biochemical interactions. However, the determinant features of tyrosine sulfation sites are unknown. M...

Bibliographic Details
Main Authors: Huang, Po-Tsun, 黃柏淳
Other Authors: Ching, Yu-Tai
Format: Others
Language: en_US
Published: 2013
Online Access: http://ndltd.ncl.edu.tw/handle/30096393732610573630
id ndltd-TW-101NCTU5810114
record_format oai_dc
spelling ndltd-TW-101NCTU5810114 2016-07-02T04:20:16Z http://ndltd.ncl.edu.tw/handle/30096393732610573630 Protein Tyrosine Sulfation Sites Prediction: Based on Support Vector Machine and Pairwise Position Weighted Matrix of Amino Acid Sequence 利用支援向量機及胺基酸成對位置權重矩陣特徵預測蛋白質中酪胺酸硫酸化位置 Huang, Po-Tsun 黃柏淳 Master's, National Chiao Tung University, Institute of Biomedical Engineering, 101 Ching, Yu-Tai 荊宇泰 2013 thesis 33 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Chiao Tung University === Institute of Biomedical Engineering === 101 === Protein tyrosine sulfation is a common post-translational modification. Identifying tyrosine sulfation sites is important for biologists to predict biochemical interactions. However, the determinant features of tyrosine sulfation sites are unknown. Moreover, few sulfotyrosine sites have been identified experimentally, and non-sulfotyrosine sites are 26 times more numerous than sulfotyrosine sites. This thesis presents a prediction method based on a support vector machine (SVM), with amino acid sequences encoded by a pairwise position weighted matrix (PPWM), to predict tyrosine sulfation sites. Because sulfotyrosine sites are fewer than non-sulfotyrosine sites, we incorporate resampling of the training data to build multiple SVM models. The final prediction is made by a voting mechanism over those models. A single SVM model achieves an average accuracy of 99.2% under five-fold cross-validation. The proposed method achieves an accuracy of 98.3% when tested on all known tyrosine sites with voting. In addition, by analyzing the PPWM, we discovered that certain patterns occur more frequently within sulfotyrosine subsequences, such as acidic amino acids on each side of the tyrosine residue and tryptophan (W) paired with an acidic amino acid. The results may help biologists to discover tyrosine sulfation.
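The resampling-and-voting scheme summarized in the description can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names (`balanced_subsets`, `train_centroid_model`, `vote`), the undersampling ratio, the tie-breaking rule, and the toy two-dimensional data are all assumptions made for illustration. A nearest-centroid classifier stands in for the SVM so that the example needs only the standard library, and the PPWM encoding is not reproduced because the record does not define it.

```python
import random


def balanced_subsets(positives, negatives, n_models, seed=0):
    """Yield n_models training sets (hypothetical helper).

    Each set pairs every positive example with an equally sized random
    sample of negatives, i.e. the majority class is undersampled.
    """
    rng = random.Random(seed)
    k = min(len(positives), len(negatives))
    for _ in range(n_models):
        sampled = rng.sample(negatives, k)
        yield [(x, 1) for x in positives] + [(x, 0) for x in sampled]


def train_centroid_model(training_set):
    """Stand-in for SVM training (assumption: NOT an SVM).

    Builds a nearest-centroid classifier and returns its predict function.
    """
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in training_set if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return 1 if sq_dist(centroids[1]) < sq_dist(centroids[0]) else 0

    return predict


def vote(models, x):
    """Majority vote over all models; a tie is treated as negative."""
    positive_votes = sum(model(x) for model in models)
    return 1 if positive_votes * 2 > len(models) else 0


# Toy data with the 26:1 class imbalance mentioned in the description.
positives = [[1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
negatives = [[-1.0 + 0.01 * i, -1.0 - 0.01 * i] for i in range(78)]

models = [
    train_centroid_model(subset)
    for subset in balanced_subsets(positives, negatives, n_models=5)
]

print(vote(models, [1.0, 0.8]))    # query point near the positive cluster
print(vote(models, [-1.0, -0.9]))  # query point near the negative cluster
```

Training each model on a different balanced sample lets every model see all positive sites while the ensemble as a whole covers more of the negative class than any single balanced set could.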
author2 Ching, Yu-Tai
author_facet Ching, Yu-Tai
Huang, Po-Tsun
黃柏淳
author Huang, Po-Tsun
黃柏淳
author_sort Huang, Po-Tsun
title Protein Tyrosine Sulfation Sites Prediction: Based on Support Vector Machine and Pairwise Position Weighted Matrix of Amino Acid Sequence
publishDate 2013
url http://ndltd.ncl.edu.tw/handle/30096393732610573630