Affinity Propagation Based Consensus Clustering for Time-Series Gene Expression
Master's === National Tsing Hua University === Department of Computer Science === 97 === In recent years, DNA microarray technology has played a key role in molecular biology research. As experiments that track biological processes over time have multiplied, analyzing statistical patterns in time-series data has become a crucial step in exploring the c...
Main Authors: | Chiu, Tai-Yu, 邱泰諭 |
---|---|
Other Authors: | Wang, Jia-Shung |
Format: | Others |
Language: | en_US |
Published: | 2009 |
Online Access: | http://ndltd.ncl.edu.tw/handle/89938953759201166172 |
id |
ndltd-TW-097NTHU5392106 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-097NTHU5392106 2015-11-13T04:08:49Z http://ndltd.ncl.edu.tw/handle/89938953759201166172 Affinity Propagation Based Consensus Clustering for Time-Series Gene Expression 運用於基因表現時間序列的親和性互動式分群演算法 Chiu, Tai-Yu 邱泰諭 Master's National Tsing Hua University Department of Computer Science 97 In recent years, DNA microarray technology has played a key role in molecular biology research. As experiments that track biological processes over time have multiplied, analyzing statistical patterns in time-series data has become a crucial step in exploring the complex dynamics of biological systems. Because of noise and measurement uncertainty, analyzing time series is more complicated than analyzing ordinary data. Early clustering methods such as k-means, self-organizing maps, and hierarchical clustering neglect the temporal dependence between successive time points. Probabilistic model-based clustering methods such as dynamic Bayesian networks (DBN) and hidden Markov models (HMM) are better suited to time series but are computationally inefficient. This thesis proposes an unsupervised clustering algorithm that combines a recently proposed clustering scheme, Affinity Propagation, with the spirit of consensus clustering over multiple clustering partitions. The proposed method investigates the relationships between genes across distinct time points by selecting intervals of time points, and eliminates the influence of noise and outliers. The method produces a clustering result without a priori knowledge of the number of clusters or the exemplars, and demonstrates significant clustering accuracy on synthetic and real gene expression time-series datasets. In addition, the biological relevance of the clustering results is analyzed using Gene Ontology annotations and compared with earlier work. The study suggests possible directions for clustering gene expression time-series data in future biological investigations. Wang, Jia-Shung 王家祥 2009 學位論文 ; thesis 46 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others
|
sources |
NDLTD |
description |
Master's === National Tsing Hua University === Department of Computer Science === 97 === In recent years, DNA microarray technology has played a key role in molecular biology research. As experiments that track biological processes over time have multiplied, analyzing statistical patterns in time-series data has become a crucial step in exploring the complex dynamics of biological systems. Because of noise and measurement uncertainty, analyzing time series is more complicated than analyzing ordinary data. Early clustering methods such as k-means, self-organizing maps, and hierarchical clustering neglect the temporal dependence between successive time points. Probabilistic model-based clustering methods such as dynamic Bayesian networks (DBN) and hidden Markov models (HMM) are better suited to time series but are computationally inefficient. This thesis proposes an unsupervised clustering algorithm that combines a recently proposed clustering scheme, Affinity Propagation, with the spirit of consensus clustering over multiple clustering partitions. The proposed method investigates the relationships between genes across distinct time points by selecting intervals of time points, and eliminates the influence of noise and outliers. The method produces a clustering result without a priori knowledge of the number of clusters or the exemplars, and demonstrates significant clustering accuracy on synthetic and real gene expression time-series datasets. In addition, the biological relevance of the clustering results is analyzed using Gene Ontology annotations and compared with earlier work. The study suggests possible directions for clustering gene expression time-series data in future biological investigations.
|
author2 |
Wang, Jia-Shung |
author_facet |
Wang, Jia-Shung Chiu, Tai-Yu 邱泰諭 |
author |
Chiu, Tai-Yu 邱泰諭 |
spellingShingle |
Chiu, Tai-Yu 邱泰諭 Affinity Propagation Based Consensus Clustering for Time-Series Gene Expression |
author_sort |
Chiu, Tai-Yu |
title |
Affinity Propagation Based Consensus Clustering for Time-Series Gene Expression |
title_short |
Affinity Propagation Based Consensus Clustering for Time-Series Gene Expression |
title_full |
Affinity Propagation Based Consensus Clustering for Time-Series Gene Expression |
title_fullStr |
Affinity Propagation Based Consensus Clustering for Time-Series Gene Expression |
title_full_unstemmed |
Affinity Propagation Based Consensus Clustering for Time-Series Gene Expression |
title_sort |
affinity propagation based consensus clustering for time-series gene expression |
publishDate |
2009 |
url |
http://ndltd.ncl.edu.tw/handle/89938953759201166172 |
work_keys_str_mv |
AT chiutaiyu affinitypropagationbasedconsensusclusteringfortimeseriesgeneexpression AT qiūtàiyù affinitypropagationbasedconsensusclusteringfortimeseriesgeneexpression AT chiutaiyu yùnyòngyújīyīnbiǎoxiànshíjiānxùlièdeqīnhéxìnghùdòngshìfēnqúnyǎnsuànfǎ AT qiūtàiyù yùnyòngyújīyīnbiǎoxiànshíjiānxùlièdeqīnhéxìnghùdòngshìfēnqúnyǎnsuànfǎ |
_version_ |
1718128335443197952 |
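
Illustrative note: the abstract describes combining Affinity Propagation with consensus clustering over partitions obtained from selected intervals of time points. The record does not give the thesis's actual procedure, so the following is only a rough sketch under stated assumptions: contiguous sliding windows stand in for the interval selection, negative squared Euclidean distance is the similarity, the median similarity is the preference, and the consensus step reruns Affinity Propagation on a co-association matrix. The names `affinity_propagation`, `co_association`, `consensus_cluster`, and the `width` parameter are hypothetical.

```python
from statistics import median


def affinity_propagation(S, damping=0.7, iters=200):
    """Affinity propagation on a square similarity matrix S (list of lists).

    S[k][k] is the preference of point k. Returns, for every point, the
    index of the exemplar it is assigned to.
    """
    n = len(S)
    if n < 2:
        return [0] * n
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iters):
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            top = max(range(n), key=vals.__getitem__)
            runner_up = max(v for k, v in enumerate(vals) if k != top)
            for k in range(n):
                # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
                new = S[i][k] - (runner_up if k == top else vals[top])
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # sum over i' != k of max(0, r(i',k))
            for i in range(n):
                # a(k,k) = total; a(i,k) = min(0, r(k,k) + total - max(0, r(i,k)))
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    score = [A[k][k] + R[k][k] for k in range(n)]
    exemplars = [k for k in range(n) if score[k] > 0]
    if not exemplars:  # assumption: fall back to the single best candidate
        exemplars = [max(range(n), key=score.__getitem__)]
    labels = [max(exemplars, key=lambda e: S[i][e]) for i in range(n)]
    for e in exemplars:
        labels[e] = e  # every exemplar represents itself
    return labels


def with_median_preference(S):
    """Copy S and set the diagonal (preference) to the median similarity."""
    n = len(S)
    pref = median(S[i][j] for i in range(n) for j in range(n) if i != j)
    P = [row[:] for row in S]
    for k in range(n):
        P[k][k] = pref
    return P


def co_association(partitions):
    """Fraction of partitions in which each pair of items shares a cluster."""
    n = len(partitions[0])
    return [[sum(p[i] == p[j] for p in partitions) / len(partitions)
             for j in range(n)] for i in range(n)]


def consensus_cluster(series, width=3):
    """Cluster each time window with AP, then cluster the co-association matrix."""
    n, T = len(series), len(series[0])
    if n < 2:
        return [0] * n
    partitions = []
    for start in range(max(1, T - width + 1)):  # assumed interval selection
        cut = [row[start:start + width] for row in series]
        S = [[-sum((a - b) ** 2 for a, b in zip(cut[i], cut[j]))
              for j in range(n)] for i in range(n)]
        partitions.append(affinity_propagation(with_median_preference(S)))
    votes = co_association(partitions)
    return affinity_propagation(with_median_preference(votes))
```

Calling `consensus_cluster(series, width=3)` on a list of equal-length expression profiles returns one exemplar index per gene, so neither the number of clusters nor the exemplars are supplied in advance. Exact ties in the similarities can make plain Affinity Propagation oscillate, so the fixed iteration count and the fallback exemplar are simplifications of this sketch.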