Summary: | Master's === Taichung Healthcare and Management University === Institute of Bioinformatics === 93 === In this thesis, I investigated how information about the physicochemical environment of amino acids, such as protein secondary structure and residue solvent accessibility, can improve the prediction of protein structural classes.
Score matrices were computed for known protein sequences in several structural classes (such as all-α and all-β, according to the SCOP classification). The sequences are taken from a protein secondary structure database such as DSSP, so that a 3D structure profile can be constructed for each entry in the PDB. These profiles are then used to score a query protein sequence for compatibility with the known structural classes.
To demonstrate that the 3D structure profile method can detect sequences compatible with a known class, the query sequences are aligned with the environments of a known protein structure using a simple sequence alignment algorithm. My study indicates that the method achieves greater than 95% accuracy in protein class assignment (average score < 0.5). Furthermore, I established that the structure profile approach can detect distant sequences well below the twilight zone (less than 25% sequence similarity).
|
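The scoring idea summarized above can be illustrated with a minimal Python sketch. It is not the thesis implementation. Everything specific in it is an assumption made for illustration: the environment labels (a secondary structure letter plus `b` for buried or `e` for exposed, for example `Hb`), the log-odds form of the score matrix, the pseudocount, the linear gap penalty, and the function names `build_score_matrix` and `alignment_score`. A plain global (Needleman-Wunsch style) alignment stands in for the "simple sequence alignment algorithm".

```python
import math
from collections import Counter, defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_score_matrix(examples, pseudocount=1.0):
    """Build environment-specific log-odds scores (illustrative form).

    examples: iterable of (sequence, environments) pairs, where
    environments is a list of labels such as "Hb" (hypothetical
    encoding: secondary structure + buried/exposed), one per residue.
    Returns scores[env][aa] = ln(P(aa | env) / P(aa)).
    """
    env_counts = defaultdict(Counter)
    background = Counter()
    for seq, envs in examples:
        if len(seq) != len(envs):
            raise ValueError("sequence and environment lengths differ")
        for aa, env in zip(seq, envs):
            if aa not in AMINO_ACIDS:
                continue  # skip non-standard residues
            env_counts[env][aa] += 1
            background[aa] += 1

    n_aa = len(AMINO_ACIDS)
    bg_total = sum(background.values()) + pseudocount * n_aa
    scores = {}
    for env, counts in env_counts.items():
        env_total = sum(counts.values()) + pseudocount * n_aa
        scores[env] = {}
        for aa in AMINO_ACIDS:
            p_env = (counts[aa] + pseudocount) / env_total
            p_bg = (background[aa] + pseudocount) / bg_total
            scores[env][aa] = math.log(p_env / p_bg)
    return scores


def alignment_score(query, profile_envs, scores, gap=-1.0):
    """Global alignment score of a query sequence against a 3D profile.

    profile_envs lists the environment label at each position of the
    known structure; aligning residue `aa` to a position with label
    `env` contributes scores[env][aa]. Gaps use a linear penalty
    (an assumption of this sketch).
    """
    n, m = len(query), len(profile_envs)
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            match = prev[j - 1] + scores[profile_envs[j - 1]][query[i - 1]]
            curr[j] = max(match, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[m]
```

In this sketch a query would be scored against the profile of each known class and assigned to the class with the best alignment score. How the thesis normalizes scores or applies its reported threshold is not stated in the summary and is not modeled here.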