Structure Pattern Analysis Using Term Rewriting and Clustering Algorithm

Biological data are accumulating at a fast pace. However, raw data are generally difficult to understand and are not useful unless we unlock the information hidden in them. Knowledge can be extracted as the patterns or features buried within the data. Thus data mining, which aims at uncovering u...

Bibliographic Details
Main Author: Fu, Xuezheng
Format: Others
Published: Digital Archive @ GSU 2007
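Subjects: Bioinformatics; K-means clustering algorithm; Term rewriting; Stability; Knowledge discovery; Data mining; Validation measure; Computer Sciences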
Online Access: http://digitalarchive.gsu.edu/cs_diss/17
http://digitalarchive.gsu.edu/cgi/viewcontent.cgi?article=1016&context=cs_diss
id ndltd-GEORGIA-oai-digitalarchive.gsu.edu-cs_diss-1016
record_format oai_dc
spelling ndltd-GEORGIA-oai-digitalarchive.gsu.edu-cs_diss-1016 2013-04-23T03:18:55Z 2007-06-27 text application/pdf Computer Science Dissertations
collection NDLTD
format Others
sources NDLTD
topic Bioinformatics
K-means clustering algorithm
Term rewriting
Stability
Knowledge discovery
Data mining
Validation measure
Computer Sciences
description Biological data are accumulating at a fast pace. However, raw data are generally difficult to understand and are not useful unless we unlock the information hidden in them. Knowledge can be extracted as the patterns or features buried within the data. Thus data mining, which aims at uncovering underlying rules, relationships, and patterns in data, has emerged as one of the most exciting fields in computational science. In this dissertation, we develop efficient approaches to the structure pattern analysis of RNA and protein three-dimensional structures. The major techniques used in this work include term rewriting and clustering algorithms. First, a new approach is designed to study the interaction of RNA secondary structure motifs using the concept of term rewriting. Second, an improved K-means clustering algorithm is proposed to estimate the number of clusters in data. A new distance descriptor is introduced for the appropriate representation of three-dimensional structure segments of RNA and proteins. The experimental results show improvements in determining the number of clusters in data, evaluating RNA structure similarity, and searching RNA structure databases, as well as a better understanding of protein sequence-structure correspondence.
author Fu, Xuezheng
title Structure Pattern Analysis Using Term Rewriting and Clustering Algorithm
publisher Digital Archive @ GSU
publishDate 2007
url http://digitalarchive.gsu.edu/cs_diss/17
http://digitalarchive.gsu.edu/cgi/viewcontent.cgi?article=1016&context=cs_diss