Prediction of protein thermostability using Decision Tree base on sequence and structure features

Bibliographic Details
Main Authors: Jian-Sin Li, 李見信
Other Authors: Jorng-Tzong Horng
Format: Others
Language: en_US
Published: 2005
Online Access: http://ndltd.ncl.edu.tw/handle/57058783926064390275
Description
Summary: Master's thesis === National Central University === Graduate Institute of Computer Science and Information Engineering === 93 === Protein thermostability is closely tied to the production of many biomaterials. Recent research has identified features that are significant for the thermal stability of proteins by comparing homologous proteins. Amino acid composition and special patterns in the sequence, together with hydrogen bonds, disulfide bonds, salt bridges, and other elements of protein structure, are considered important for thermostability. In this study, we present a system that integrates these factors to predict protein thermostability. A large number of proteins are taken from PGTdb and PDB. First, various features are extracted from the sequences and structures. Then, a feature selection algorithm retains the features that have a higher linear correlation coefficient with thermostability. Finally, the selected features are fed to a machine learning approach to build a prediction system. We find that two features, (E+F+M+R)/residue and charged/noncharged, are linearly correlated with thermostability. We establish two prediction systems: one predicts protein thermostability from the protein sequence alone, and the other achieves better performance when the protein structure is known.
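
The two sequence features named in the summary, and the correlation-based filtering step, can be illustrated with a minimal Python sketch. It is not the thesis implementation, and it rests on several assumptions: (E+F+M+R)/residue is read as the combined count of Glu, Phe, Met and Arg divided by sequence length; "charged" is taken to mean D, E, K and R (histidine excluded); and the linear correlation coefficient is taken to be Pearson's r. The function names are hypothetical.

```python
import math

# Assumption: charged residues are Asp (D), Glu (E), Lys (K), Arg (R).
# The record does not give the thesis's exact definition.
CHARGED = frozenset("DEKR")


def efmr_fraction(seq: str) -> float:
    """Fraction of residues that are E, F, M or R (assumed reading of (E+F+M+R)/residue)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(seq.count(aa) for aa in "EFMR") / len(seq)


def charged_ratio(seq: str) -> float:
    """Ratio of charged to noncharged residues under the CHARGED assumption above."""
    seq = seq.upper()
    charged = sum(1 for aa in seq if aa in CHARGED)
    noncharged = len(seq) - charged
    if noncharged == 0:
        raise ValueError("sequence has no noncharged residues")
    return charged / noncharged


def pearson(xs, ys) -> float:
    """Pearson linear correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation undefined for a constant list")
    return cov / (sx * sy)
```

In a feature-selection pass of the kind the summary describes, each candidate feature would be computed for every protein, correlated against a thermostability measure with `pearson`, and kept only if the coefficient is high enough; the cutoff and the thermostability measure are not stated in this record.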