Clustering based on Principal Component Analysis and Pseudoinverse Transformation

Master's === National Taiwan Ocean University === Department of Electrical Engineering === 96 === This thesis presents a clustering pre-process that uses a PCA or SVD transformation to improve current projection-based clustering algorithms. The effectiveness of the pre-process is demonstrated by incorporating it into a projection-based method called...


Bibliographic Details
Main Authors: Sih-Yin Shen, 沈思吟
Other Authors: Jung-Hua Wang
Format: Others
Language: en_US
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/88980614390164214633
id ndltd-TW-096NTOU5442064
record_format oai_dc
spelling ndltd-TW-096NTOU5442064 2016-04-27T04:11:26Z http://ndltd.ncl.edu.tw/handle/88980614390164214633 Clustering based on Principal Component Analysis and Pseudoinverse Transformation 基於主成分分析及虛擬反矩陣之分群演算法 Sih-Yin Shen 沈思吟 Master's National Taiwan Ocean University Department of Electrical Engineering 96 This thesis presents a clustering pre-process that uses a PCA or SVD transformation to improve current projection-based clustering algorithms. The effectiveness of the pre-process is demonstrated by incorporating it into a projection-based method called DEPIT (Dimension Extension and Pseudo-Inverse Transformation), greatly improving the performance of DEPIT on 2-D and 3-D input data. In [7], it was shown that the performance of DEPIT is strongly affected by the form of the data distribution; hence, carefully analyzing the structure of the input data is essential. The pre-processing technique employs PCA or SVD to transform the input data from the original space to a space spanned by the eigenvectors or singular vectors; that is, each data point is represented as a linear combination of the eigenvectors or singular vectors. The proposed pre-processing technique enhances the applicability of DEPIT to data distributions of various forms, regardless of whether a dominant principal component exists. More significantly, the tedious rotation schedule and voting process of the original DEPIT become entirely dispensable. The issue of how a dominant component affects the clustering result is also addressed: the existence of a heavily dominant component is first detected, and if no such component exists, the transformed data set is stretched or shrunk by applying an F operator. We also show that the Mahalanobis metric outperforms the Euclidean measure for updating centroids, as the former is more robust against outliers and more accurate for various data distributions. Jung-Hua Wang 王榮華 2008 Academic thesis ; thesis 50 en_US
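
The pre-process described above projects the input data onto the eigenvector (or singular-vector) basis before clustering. The following Python sketch illustrates that idea only under stated assumptions: the dominance_ratio threshold and the per-axis rescaling used as a stand-in for the thesis's F operator are hypothetical, as the record does not define them.

    # Hedged sketch of a PCA/SVD pre-processing step; not the thesis's exact algorithm.
    import numpy as np

    def pca_preprocess(X, dominance_ratio=0.9):
        """Project 2-D/3-D points onto the principal axes; rescale if no component dominates.

        dominance_ratio is a hypothetical threshold on the explained-variance share
        used to decide whether one principal component heavily dominates.
        """
        Xc = X - X.mean(axis=0)                       # centre the data
        U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
        T = Xc @ Vt.T                                 # coordinates in the singular-vector basis
        var_share = (s ** 2) / np.sum(s ** 2)         # explained variance per component
        if var_share[0] < dominance_ratio:
            # No heavily dominant component: stretch/shrink each axis so the variances
            # become comparable (an assumed stand-in for the thesis's F operator).
            T = T / (s / np.sqrt(len(X) - 1))
        return T

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
        print(pca_preprocess(X)[:3])
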
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Taiwan Ocean University === Department of Electrical Engineering === 96 === This thesis presents a clustering pre-process that uses a PCA or SVD transformation to improve current projection-based clustering algorithms. The effectiveness of the pre-process is demonstrated by incorporating it into a projection-based method called DEPIT (Dimension Extension and Pseudo-Inverse Transformation), greatly improving the performance of DEPIT on 2-D and 3-D input data. In [7], it was shown that the performance of DEPIT is strongly affected by the form of the data distribution; hence, carefully analyzing the structure of the input data is essential. The pre-processing technique employs PCA or SVD to transform the input data from the original space to a space spanned by the eigenvectors or singular vectors; that is, each data point is represented as a linear combination of the eigenvectors or singular vectors. The proposed pre-processing technique enhances the applicability of DEPIT to data distributions of various forms, regardless of whether a dominant principal component exists. More significantly, the tedious rotation schedule and voting process of the original DEPIT become entirely dispensable. The issue of how a dominant component affects the clustering result is also addressed: the existence of a heavily dominant component is first detected, and if no such component exists, the transformed data set is stretched or shrunk by applying an F operator. We also show that the Mahalanobis metric outperforms the Euclidean measure for updating centroids, as the former is more robust against outliers and more accurate for various data distributions.
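
The abstract argues that the Mahalanobis metric is more robust to outliers than the Euclidean measure when updating centroids. Below is a minimal sketch of one such update step, assuming a generic k-means-style assignment and a covariance estimated from the whole data set; the thesis's actual DEPIT update rule may differ.

    # Hedged sketch of a Mahalanobis-based centroid update; not the exact DEPIT rule.
    import numpy as np

    def mahalanobis_update(X, centroids):
        """One assignment/update step using squared Mahalanobis distances to centroids."""
        # Regularised covariance of the whole data set (a per-cluster estimate is also possible).
        cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
        inv_cov = np.linalg.inv(cov)
        # Squared Mahalanobis distance of every point to every centroid.
        dists = np.stack([np.einsum('ij,jk,ik->i', X - c, inv_cov, X - c)
                          for c in centroids], axis=1)
        labels = dists.argmin(axis=1)
        # Recompute each centroid from its assigned points; keep the old one if a cluster empties.
        new_centroids = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                  else centroids[j] for j in range(len(centroids))])
        return labels, new_centroids

Replacing the Euclidean distance with the Mahalanobis form accounts for the spread and correlation of the data, which is what makes the assignment less sensitive to elongated clusters and outlying points.
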
author2 Jung-Hua Wang
author_facet Jung-Hua Wang
Sih-Yin Shen
沈思吟
author Sih-Yin Shen
沈思吟
spellingShingle Sih-Yin Shen
沈思吟
Clustering based on Principal Component Analysis and Pseudoinverse Transformation
author_sort Sih-Yin Shen
title Clustering based on Principal Component Analysis and Pseudoinverse Transformation
title_short Clustering based on Principal Component Analysis and Pseudoinverse Transformation
title_full Clustering based on Principal Component Analysis and Pseudoinverse Transformation
title_fullStr Clustering based on Principal Component Analysis and Pseudoinverse Transformation
title_full_unstemmed Clustering based on Principal Component Analysis and Pseudoinverse Transformation
title_sort clustering based on principal component analysis and pseudoinverse transformation
publishDate 2008
url http://ndltd.ncl.edu.tw/handle/88980614390164214633
work_keys_str_mv AT sihyinshen clusteringbasedonprincipalcomponentanalysisandpseudoinversetransformation
AT chénsīyín clusteringbasedonprincipalcomponentanalysisandpseudoinversetransformation
AT sihyinshen jīyúzhǔchéngfēnfēnxījíxūnǐfǎnjǔzhènzhīfēnqúnyǎnsuànfǎ
AT chénsīyín jīyúzhǔchéngfēnfēnxījíxūnǐfǎnjǔzhènzhīfēnqúnyǎnsuànfǎ
_version_ 1718249568996425728