Structure Pattern Analysis Using Term Rewriting and Clustering Algorithm
Biological data are accumulating at a fast pace. However, raw data are generally difficult to understand and are not useful unless we unlock the information hidden in them. Knowledge/information can be extracted as the patterns or features buried within the data. Thus data mining, which aims at uncovering u...
Main Author: | Fu, Xuezheng |
---|---|
Format: | Others |
Published: | Digital Archive @ GSU 2007 |
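Subjects: | Bioinformatics; K-means clustering algorithm; Term rewriting; Stability; Knowledge discovery; Data mining; Validation measure; Computer Sciences |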
Online Access: | http://digitalarchive.gsu.edu/cs_diss/17 http://digitalarchive.gsu.edu/cgi/viewcontent.cgi?article=1016&context=cs_diss |
id |
ndltd-GEORGIA-oai-digitalarchive.gsu.edu-cs_diss-1016 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-GEORGIA-oai-digitalarchive.gsu.edu-cs_diss-1016 2013-04-23T03:18:55Z Structure Pattern Analysis Using Term Rewriting and Clustering Algorithm Fu, Xuezheng Biological data are accumulating at a fast pace. However, raw data are generally difficult to understand and are not useful unless we unlock the information hidden in them. Knowledge/information can be extracted as the patterns or features buried within the data. Thus data mining, which aims at uncovering underlying rules, relationships, and patterns in data, has emerged as one of the most exciting fields in computational science. In this dissertation, we develop efficient approaches to the structure pattern analysis of RNA and protein three-dimensional structures. The major techniques used in this work include term rewriting and clustering algorithms. First, a new approach is designed to study the interaction of RNA secondary structure motifs using the concept of term rewriting. Second, an improved K-means clustering algorithm is proposed to estimate the number of clusters in data. A new distance descriptor is introduced for the appropriate representation of three-dimensional structure segments of RNA and protein structures. The experimental results show improvements in determining the number of clusters in data, evaluating RNA structure similarity, and searching RNA structure databases, as well as a better understanding of the protein sequence-structure correspondence. 2007-06-27 text application/pdf http://digitalarchive.gsu.edu/cs_diss/17 http://digitalarchive.gsu.edu/cgi/viewcontent.cgi?article=1016&context=cs_diss Computer Science Dissertations Digital Archive @ GSU Bioinformatics K-means clustering algorithm Term rewriting Stability Knowledge discovery Data mining Validation measure Computer Sciences |
collection |
NDLTD |
format |
Others |
sources |
NDLTD |
topic |
Bioinformatics; K-means clustering algorithm; Term rewriting; Stability; Knowledge discovery; Data mining; Validation measure; Computer Sciences |
spellingShingle |
Bioinformatics K-means clustering algorithm Term rewriting Stability Knowledge discovery Data mining Validation measure Computer Sciences Fu, Xuezheng Structure Pattern Analysis Using Term Rewriting and Clustering Algorithm |
description |
Biological data are accumulating at a fast pace. However, raw data are generally difficult to understand and are not useful unless we unlock the information hidden in them. Knowledge/information can be extracted as the patterns or features buried within the data. Thus data mining, which aims at uncovering underlying rules, relationships, and patterns in data, has emerged as one of the most exciting fields in computational science. In this dissertation, we develop efficient approaches to the structure pattern analysis of RNA and protein three-dimensional structures. The major techniques used in this work include term rewriting and clustering algorithms. First, a new approach is designed to study the interaction of RNA secondary structure motifs using the concept of term rewriting. Second, an improved K-means clustering algorithm is proposed to estimate the number of clusters in data. A new distance descriptor is introduced for the appropriate representation of three-dimensional structure segments of RNA and protein structures. The experimental results show improvements in determining the number of clusters in data, evaluating RNA structure similarity, and searching RNA structure databases, as well as a better understanding of the protein sequence-structure correspondence. |
author |
Fu, Xuezheng |
author_facet |
Fu, Xuezheng |
author_sort |
Fu, Xuezheng |
title |
Structure Pattern Analysis Using Term Rewriting and Clustering Algorithm |
title_short |
Structure Pattern Analysis Using Term Rewriting and Clustering Algorithm |
title_full |
Structure Pattern Analysis Using Term Rewriting and Clustering Algorithm |
title_fullStr |
Structure Pattern Analysis Using Term Rewriting and Clustering Algorithm |
title_full_unstemmed |
Structure Pattern Analysis Using Term Rewriting and Clustering Algorithm |
title_sort |
structure pattern analysis using term rewriting and clustering algorithm |
publisher |
Digital Archive @ GSU |
publishDate |
2007 |
url |
http://digitalarchive.gsu.edu/cs_diss/17 http://digitalarchive.gsu.edu/cgi/viewcontent.cgi?article=1016&context=cs_diss |
work_keys_str_mv |
AT fuxuezheng structurepatternanalysisusingtermrewritingandclusteringalgorithm |
_version_ |
1716583949004701696 |