REALoc: Reliable and effective methods to assist predicting human protein subcellular localization

Master's === National Chung Hsing University === Graduate Institute of Genomics and Bioinformatics === 101 === Protein subcellular localization is an important part of biological research: it can support drug development and help explore the function of proteins. Many subcellular localization prediction tools have been developed; most of them used the data of eukaryotes o...


Bibliographic Details
Main Authors: Han-Hao Sun, 孫翰豪
Other Authors: 朱彥煒
Format: Others
Language: en_US
Published: 2013
Online Access: http://ndltd.ncl.edu.tw/handle/76013331557236563304
id ndltd-TW-101NCHU5105050
record_format oai_dc
spelling ndltd-TW-101NCHU5105050 2017-10-29T04:34:18Z http://ndltd.ncl.edu.tw/handle/76013331557236563304 REALoc: Reliable and effective methods to assist predicting human protein subcellular localization 採用可靠與有效的方法輔助預測人類蛋白質亞細胞位置 Han-Hao Sun 孫翰豪 Master's National Chung Hsing University Graduate Institute of Genomics and Bioinformatics 101 朱彥煒 2013 thesis 35 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Chung Hsing University === Graduate Institute of Genomics and Bioinformatics === 101 === Protein subcellular localization is an important part of biological research: it can support drug development and help explore the function of proteins. Many subcellular localization prediction tools have been developed, but most of them were trained on data from eukaryotes or prokaryotes in general, and predictors specific to human proteins are rare. We established a system, called REALoc, to predict the subcellular localization of human proteins with both singleplex and multiplex localization. It is based on a two-layer architecture that integrates two different machine learning methods, one-to-one and many-to-many. The system includes many sequence-based and function-based features, such as amino acid composition and surface accessibility. In addition, we developed a series of computed features, including weighted sign AAindex, sequence similarity profile, and regular-mRMR feature selection for Gene Ontology. Five-fold cross-validation was performed with iLoc-Hum on a training dataset covering 6 location sites (cell membrane, cytoplasm, endoplasmic reticulum/Golgi apparatus, mitochondrion, nucleus, secreted). The overall absolute true success rate of REALoc is 75.34% on the training dataset and 57.14% on the testing dataset, about 10% higher than four other prediction systems. Finally, this study discusses the performance of the two decision mechanisms, vote and GANN, for predicting single and multiple locations. Furthermore, the relationship between protein-protein interaction and subcellular localization was investigated using motifs.
author2 朱彥煒
author_facet 朱彥煒
Han-Hao Sun
孫翰豪
author Han-Hao Sun
孫翰豪
spellingShingle Han-Hao Sun
孫翰豪
REALoc: Reliable and effective methods to assist predicting human protein subcellular localization
author_sort Han-Hao Sun
title REALoc: Reliable and effective methods to assist predicting human protein subcellular localization
title_sort realoc: reliable and effective methods to assist predicting human protein subcellular localization
publishDate 2013
url http://ndltd.ncl.edu.tw/handle/76013331557236563304
work_keys_str_mv AT hanhaosun realocreliableandeffectivemethodstoassistpredictinghumanproteinsubcellularlocalization
AT sūnhànháo realocreliableandeffectivemethodstoassistpredictinghumanproteinsubcellularlocalization
AT hanhaosun cǎiyòngkěkàoyǔyǒuxiàodefāngfǎfǔzhùyùcèrénlèidànbáizhìyàxìbāowèizhì
AT sūnhànháo cǎiyòngkěkàoyǔyǒuxiàodefāngfǎfǔzhùyùcèrénlèidànbáizhìyàxìbāowèizhì
_version_ 1718557304436031488
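
Note on two terms used in the description field: the abstract lists amino acid composition among the sequence-based features and reports results as an "overall absolute true success rate". The record does not define either term, so the sketch below is only an illustration under stated assumptions, not the thesis's implementation. It assumes that amino acid composition means the fraction of each of the 20 standard residues in a sequence, and that the absolute true rate counts a prediction as correct only when the predicted set of locations matches the true set exactly, as the term is commonly used for multi-location predictors. The function names and the toy data are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in a sequence.

    Assumption: non-standard symbols (e.g. X) are ignored.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def absolute_true_rate(true_sets, predicted_sets):
    """Fraction of proteins whose predicted location set equals the
    true location set exactly (assumed meaning of 'absolute true')."""
    if len(true_sets) != len(predicted_sets):
        raise ValueError("inputs must have the same length")
    if not true_sets:
        raise ValueError("inputs must not be empty")
    hits = sum(set(t) == set(p) for t, p in zip(true_sets, predicted_sets))
    return hits / len(true_sets)


# Hypothetical toy data, not taken from the thesis.
true_locations = [
    {"Nucleus"},
    {"Cytoplasm", "Nucleus"},   # a multiplex protein
    {"Secreted"},
    {"Mitochondrion"},
]
predicted_locations = [
    {"Nucleus"},
    {"Cytoplasm"},              # partial match counts as a miss
    {"Secreted"},
    {"Mitochondrion"},
]

print(absolute_true_rate(true_locations, predicted_locations))  # 0.75
print(amino_acid_composition("AAGG")["A"])                      # 0.5
```

Under this assumed definition a partially correct multi-location prediction earns no credit, which makes the measure stricter than per-location accuracy.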