Affinity Propagation Based Consensus Clustering for Time-Series Gene Expression

Master's === National Tsing Hua University === Department of Computer Science === 97 === In recent years, DNA microarray technology has played a key role in molecular biology research. As the number of experiments on biological processes over time increases, analyzing statistical patterns in time-series data has become a crucial step for exploring the c...

Bibliographic Details
Main Authors: Chiu, Tai-Yu, 邱泰諭
Other Authors: Wang, Jia-Shung
Format: Others
Language: en_US
Published: 2009
Online Access: http://ndltd.ncl.edu.tw/handle/89938953759201166172
id ndltd-TW-097NTHU5392106
record_format oai_dc
spelling ndltd-TW-097NTHU5392106 2015-11-13T04:08:49Z http://ndltd.ncl.edu.tw/handle/89938953759201166172 Affinity Propagation Based Consensus Clustering for Time-Series Gene Expression 運用於基因表現時間序列的親和性互動式分群演算法 Chiu, Tai-Yu 邱泰諭 Master's, National Tsing Hua University, Department of Computer Science, 97 Wang, Jia-Shung 王家祥 2009 degree thesis ; thesis 46 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Tsing Hua University === Department of Computer Science === 97 === In recent years, DNA microarray technology has played a key role in molecular biology research. As the number of experiments on biological processes over time increases, analyzing statistical patterns in time-series data has become a crucial step for exploring the complex dynamics of biological systems. Because of noise and measurement uncertainty, analyzing time series is more complicated than common data analysis. Early clustering methods such as k-means, self-organizing maps, and hierarchical clustering neglect the temporal dependence between successive time points. Probabilistic model-based clustering methods such as dynamic Bayesian networks (DBN) and hidden Markov models (HMM) are better suited to time series but are computationally inefficient. This thesis proposes an unsupervised clustering algorithm that combines a recently proposed clustering scheme, Affinity Propagation, with the spirit of consensus clustering over multiple clustering partitions. The proposed method investigates the relationships between genes across distinct time points by selecting intervals of time points, and it eliminates the influence of noise and outliers. Our method produces a clustering result without a priori knowledge of the number of clusters or the exemplars, and demonstrates significant clustering accuracy on synthetic and real gene expression time-series datasets. In addition, the biological relevance of the clustering results is analyzed using Gene Ontology annotations and compared with earlier work. Our study suggests possible directions for clustering gene expression time-series data in future biological investigations.
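The abstract names the ingredients (Affinity Propagation, interval selection over time points, consensus over multiple partitions) but not how they are combined. The sketch below is one possible reading, not the thesis's actual algorithm. It assumes sliding windows of fixed length as the intervals, negative squared Euclidean distance as the similarity, the median similarity as the preference, and a majority-vote threshold on the co-association matrix as the consensus step. All function names and the toy data are hypothetical. Practical Affinity Propagation implementations also add a little noise to break ties, which is omitted here.

```python
from statistics import median


def affinity_propagation(S, damping=0.5, iters=200):
    """Affinity Propagation on an n x n similarity matrix S (S[k][k] = preference).
    Returns a list mapping each item to the index of its exemplar."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iters):
        # r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(scores[j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_other)
        # a(i,k) = min(0, r(k,k) + sum over i' not in {i,k} of max(0, r(i',k)))
        # a(k,k) = sum over i' != k of max(0, r(i',k))
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if A[k][k] + R[k][k] > 0]
    if not exemplars:  # assumption: fall back to the single strongest candidate
        exemplars = [max(range(n), key=lambda k: A[k][k] + R[k][k])]
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]


def neg_sq_dist_similarity(profiles):
    """Negative squared Euclidean similarity; diagonal set to the median (preference)."""
    n = len(profiles)
    S = [[-sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
          for j in range(n)] for i in range(n)]
    pref = median(S[i][j] for i in range(n) for j in range(n) if i != j)
    for k in range(n):
        S[k][k] = pref
    return S


def co_association(data, window=3):
    """Cluster every sliding window of time points and return the fraction of
    windows in which each pair of genes shares an exemplar."""
    n, t = len(data), len(data[0])
    starts = range(t - window + 1)
    counts = [[0] * n for _ in range(n)]
    for s in starts:
        segment = [row[s:s + window] for row in data]
        labels = affinity_propagation(neg_sq_dist_similarity(segment))
        for i in range(n):
            for j in range(n):
                counts[i][j] += labels[i] == labels[j]
    return [[c / len(starts) for c in row] for row in counts]


def consensus_labels(co, threshold=0.5):
    """Link genes that co-cluster in more than `threshold` of the partitions and
    label the resulting connected components 0, 1, 2, ..."""
    n = len(co)
    labels = [-1] * n
    cluster = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        labels[seed] = cluster
        stack = [seed]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = cluster
                    stack.append(j)
        cluster += 1
    return labels


# Toy expression profiles (made up for illustration): three rising, three falling.
genes = [
    [0.0, 1.0, 2.0, 3.0, 4.0], [0.1, 1.2, 2.1, 3.1, 4.2], [-0.1, 0.9, 2.2, 2.9, 3.9],
    [4.0, 3.0, 2.0, 1.0, 0.0], [4.1, 3.2, 1.9, 1.1, 0.2], [3.9, 2.8, 2.1, 0.8, -0.1],
]
co = co_association(genes, window=3)
print(consensus_labels(co))
```

Neither step takes the number of clusters as input, which matches the abstract's claim of clustering without a priori knowledge of the cluster number. The window length and the vote threshold are free parameters of this sketch only.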
author2 Wang, Jia-Shung
author_facet Wang, Jia-Shung
Chiu, Tai-Yu
邱泰諭
author Chiu, Tai-Yu
邱泰諭
spellingShingle Chiu, Tai-Yu
邱泰諭
Affinity Propagation Based Consensus Clustering for Time-Series Gene Expression
author_sort Chiu, Tai-Yu
title Affinity Propagation Based Consensus Clustering for Time-Series Gene Expression
publishDate 2009
url http://ndltd.ncl.edu.tw/handle/89938953759201166172