Clustering and Characterizing Local Protein Structures by an Expectation-Maximization (EM)-Assisted Approach
Master's thesis === National Taiwan University === Graduate Institute of Biomedical Engineering === 90 === Studying protein structure is important for understanding biochemical functions within the body. One approach to protein structure prediction is to predict the local conformations of a protein and then reassemble the substructures. Previous studies ha...
Main Authors: | Ta-tsen Soong, 宋大辰 |
---|---|
Other Authors: | Chung-ming Chen |
Format: | Others |
Language: | en_US |
Published: | 2002 |
Online Access: | http://ndltd.ncl.edu.tw/handle/91604425775207861302 |
id |
ndltd-TW-090NTU00530021 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-090NTU00530021 2015-10-13T14:41:12Z http://ndltd.ncl.edu.tw/handle/91604425775207861302 Clustering and Characterizing Local Protein Structures by an Expectation-Maximization (EM)-Assisted Approach 蛋白質局部重複性結構之分析-以EM為輔助之群聚演算法 Ta-tsen Soong 宋大辰 Master's thesis National Taiwan University Graduate Institute of Biomedical Engineering 90 Studying protein structure is important for understanding biochemical functions within the body. One approach to protein structure prediction is to predict the local conformations of a protein and then reassemble the substructures. Previous studies have demonstrated that a small set of local structures can be used to reconstruct or build a protein molecule with high precision. Our study follows the same framework and proposes a method for finding recurrent local structures of proteins. The algorithm starts by applying Expectation-Maximization (EM) clustering to the distance matrices of pentamer fragment structures, which yields a rough partition of the conformation space. In the second stage, the EM clusters are subjected to a split-and-merge algorithm, which yields a finite number of clusters and guarantees the homogeneity and distinctiveness of each one (i.e., each cluster consists of very similar structures and differs from the other clusters). The results show that, with 41 major representative structures, we can approximate a test set of protein fragments with an error of 0.378 Å. With only 20 types of structures, the test set can be modeled at 0.44 Å, which is comparable to the performance of a previous method (i.e., the oligons [24]). This study also compiled a position-specific frequency map for each cluster. These frequency maps will help uncover sequence-structure relationships in future studies. Chung-ming Chen Ming-jing Hwang 陳中明 黃明經 2002 thesis 84 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's thesis === National Taiwan University === Graduate Institute of Biomedical Engineering === 90 === Studying protein structure is important for understanding biochemical functions within the body. One approach to protein structure prediction is to predict the local conformations of a protein and then reassemble the substructures. Previous studies have demonstrated that a small set of local structures can be used to reconstruct or build a protein molecule with high precision. Our study follows the same framework and proposes a method for finding recurrent local structures of proteins. The algorithm starts by applying Expectation-Maximization (EM) clustering to the distance matrices of pentamer fragment structures, which yields a rough partition of the conformation space. In the second stage, the EM clusters are subjected to a split-and-merge algorithm, which yields a finite number of clusters and guarantees the homogeneity and distinctiveness of each one (i.e., each cluster consists of very similar structures and differs from the other clusters). The results show that, with 41 major representative structures, we can approximate a test set of protein fragments with an error of 0.378 Å. With only 20 types of structures, the test set can be modeled at 0.44 Å, which is comparable to the performance of a previous method (i.e., the oligons [24]). This study also compiled a position-specific frequency map for each cluster. These frequency maps will help uncover sequence-structure relationships in future studies.
|
author2 |
Chung-ming Chen |
author_facet |
Chung-ming Chen Ta-tsen Soong 宋大辰 |
author |
Ta-tsen Soong 宋大辰 |
spellingShingle |
Ta-tsen Soong 宋大辰 Clustering and Characterizing Local Protein Structures by an Expectation-Maximization (EM)-Assisted Approach |
author_sort |
Ta-tsen Soong |
title |
Clustering and Characterizing Local Protein Structures by an Expectation-Maximization (EM)-Assisted Approach |
publishDate |
2002 |
url |
http://ndltd.ncl.edu.tw/handle/91604425775207861302 |
_version_ |
1717755921782800384 |
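Illustrative note on the abstract: the description says EM clustering is applied to the distance matrices of pentamer fragments. The sketch below shows one way such a fragment could be turned into a fixed-length feature vector suitable for clustering. It is an assumption-laden illustration, not the thesis's implementation: the record does not say which atoms are used (one representative point per residue, such as a Cα atom, is assumed here), and the function names `distance_vector` and `distance_rms` are hypothetical.

```python
import math
from itertools import combinations


def distance_vector(coords):
    """Flatten the upper triangle of a pentamer's pairwise distance matrix.

    `coords` is a sequence of five (x, y, z) points, one per residue
    (assumed here to be one representative atom per residue).
    Five points give 5 * 4 / 2 = 10 pairwise distances.
    """
    if len(coords) != 5:
        raise ValueError("a pentamer fragment needs exactly five points")
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def distance_rms(u, v):
    """Root-mean-square difference between two distance vectors.

    A simple dissimilarity chosen for illustration; the abstract does not
    specify which error measure the thesis reports.
    """
    if len(u) != len(v):
        raise ValueError("distance vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)) / len(u))


# Hypothetical fragment: five points spaced 3.8 Å apart on a straight line.
fragment = [(3.8 * i, 0.0, 0.0) for i in range(5)]
features = distance_vector(fragment)
print(len(features))  # 10
```

A distance-based representation of this kind does not change when a fragment is rotated or translated, so fragments can be compared without first superimposing them. Vectors like `features` could then be passed to any EM-based mixture-model clustering routine.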