Summary: | Master's === National Chiao Tung University === Institute of Biological Science and Technology === 91 === In this thesis, the support vector machine (SVM) method is applied with various feature vectors to predict cysteine bonding states. An SVM based on the local sequence and an SVM based on the amino acid content of the whole protein yield similar prediction performance, 81%, on a data set of 4136 cysteine-containing segments extracted from 969 non-homologous proteins, evaluated with 20-fold cross-validation. These results are contrary to the previous finding that the amino acid content of a whole protein is more informative about disulfide bonding states than the local sequence (Mucchielli-Giorgi et al., 2002). However, using a combination of the local sequence and the whole-protein amino acid content, my approach yields a significantly higher prediction accuracy of 86%.
Beyond the information embedded in the sequence, the relationships among the cysteines within a protein are also considered, and several different state diagrams are used to describe these relationships. Combining the SVM with a state diagram improves the prediction of cysteine states, and the accuracy can reach 90%.
|
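The combined feature vector described in the summary can be illustrated with a minimal sketch. The abstract does not give the window size, the encoding, or the normalization, so everything below is an assumption made for illustration: a one-hot encoded window of `half_width` residues on each side of the cysteine, an extra slot for padding or unknown residues, and a 20-value composition vector for the whole protein. The function names are hypothetical and are not taken from the thesis.

```python
# Hypothetical sketch: local-window features plus whole-protein composition
# for each cysteine in a sequence. Encoding choices are assumptions.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def window_features(seq, pos, half_width=3):
    """One-hot encode the residues in a window centred on seq[pos]."""
    slots = len(AMINO_ACIDS) + 1  # last slot marks padding or unknown residues
    vec = []
    for i in range(pos - half_width, pos + half_width + 1):
        one_hot = [0.0] * slots
        if 0 <= i < len(seq) and seq[i] in AA_INDEX:
            one_hot[AA_INDEX[seq[i]]] = 1.0
        else:
            one_hot[-1] = 1.0
        vec.extend(one_hot)
    return vec


def composition_features(seq):
    """Fraction of each of the 20 standard amino acids in the whole protein."""
    counts = [0] * len(AMINO_ACIDS)
    for aa in seq:
        if aa in AA_INDEX:
            counts[AA_INDEX[aa]] += 1
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def cysteine_feature_vectors(seq, half_width=3):
    """Return (position, feature vector) for every cysteine in seq."""
    comp = composition_features(seq)
    return [
        (pos, window_features(seq, pos, half_width) + comp)
        for pos, aa in enumerate(seq)
        if aa == "C"
    ]
```

Each resulting vector could then be passed to any SVM implementation together with its known bonding-state label. The state-diagram step mentioned in the summary is not sketched here, because the abstract does not describe its structure.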