Clustering Analysis by Attributes Interrelations and its Application to Clustering of Differentially Expressed Genes

Master's === National Taiwan University === Graduate Institute of Industrial Engineering === 93 === The unsupervised classification methods, Clustering analysis and Factor analysis, aim to find meaningful structures in the observed attributes. These structures are usually expressed by grouping attributes based on the similarities, or relationshi...

Bibliographic Details
Main Authors:	Chen-Sui Lin, 林辰穗
Other Authors:	陳正剛
Format:	Others
Language:	en_US
Published:	2005
Online Access:	http://ndltd.ncl.edu.tw/handle/50338512156095774744

id	ndltd-TW-093NTU05030006
record_format	oai_dc
spelling	ndltd-TW-093NTU05030006 2015-12-21T04:04:04Z http://ndltd.ncl.edu.tw/handle/50338512156095774744 Clustering Analysis by Attributes Interrelations and its Application to Clustering of Differentially Expressed Genes 考慮相關性之群集分析及其在基因分群上的應用 Chen-Sui Lin 林辰穗 Master's, National Taiwan University, Graduate Institute of Industrial Engineering, 93 陳正剛 2005 thesis 95 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Master's === National Taiwan University === Graduate Institute of Industrial Engineering === 93 === The unsupervised classification methods, Clustering analysis and Factor analysis, aim to find meaningful structures in the observed attributes. These structures are usually expressed by grouping attributes based on the similarities, or relationships, among them. However, Factor analysis suffers from rank deficiency in numerical computation. For example, in microarray data analysis, the expressions of 10,000 to 20,000 genes are collected for each array, so the number of genes is usually far larger than the number of microarrays. Clustering analysis, on the other hand, can handle a vast number of attributes with few samples. Clustering analysis has drawbacks of its own, including misapplication of the correlation coefficient and the difficulty of evaluating cluster quality and determining the number of clusters. In this research, we first discuss how to characterize the interrelationships among attributes, and then develop clustering methods suitable for grouping interrelated attributes. The “R2 with PCA” method places more stress on the linear relationships between two clusters, while the “Variance explanation” method focuses not only on the interrelations among attributes but also on their variations. This research also proposes statistics for evaluating cluster quality; these statistics take into account the interrelationships among clusters and the variance explained by the clusters. Finally, we apply these novel methods to two cases: one is 19 blood tests of 24 human subjects, and the other is Down syndrome microarray data.
author2	陳正剛
author	Chen-Sui Lin 林辰穗
title	Clustering Analysis by Attributes Interrelations and its Application to Clustering of Differentially Expressed Genes
publishDate	2005
url	http://ndltd.ncl.edu.tw/handle/50338512156095774744

Similar Items