A Density-based Multistage Clustering Algorithm
Master's === National Taiwan University of Science and Technology === Department of Information Management === 98 === With the growth of e-commerce in recent years, many enterprises have started to invest in computerization and thus generate a tremendous amount of data. For managers, it is a great benefit to the enterprise if useful information can be extr...
Main Authors: | Chun-hao Chuang, 莊峻豪 |
---|---|
Other Authors: | Chiun-chieh Hsu |
Format: | Others |
Language: | zh-TW |
Published: | 2010 |
Online Access: | http://ndltd.ncl.edu.tw/handle/37994188459093532207 |
id |
ndltd-TW-098NTUS5396042 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-098NTUS5396042 2016-04-22T04:23:45Z http://ndltd.ncl.edu.tw/handle/37994188459093532207 A Density-based Multistage Clustering Algorithm 以資料密度為基礎的多階段分群演算法 Chun-hao Chuang 莊峻豪 Master's National Taiwan University of Science and Technology Department of Information Management 98 Chiun-chieh Hsu 徐俊傑 2010 thesis 54 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan University of Science and Technology === Department of Information Management === 98 === With the growth of e-commerce in recent years, many enterprises have started to invest in computerization and thus generate a tremendous amount of data. For managers, it is a great benefit to the enterprise if useful information can be extracted from these raw data. Data mining has therefore become an important and popular research domain.
Clustering algorithms can recognize and partition data according to the characteristics of their attributes without any category information being defined in advance. They therefore play an important role in data mining, where the goal is to maximize the homogeneity of objects within each cluster while also maximizing the heterogeneity between clusters. Fuzzy C-Means (FCM) is one of the most popular clustering algorithms, but it requires the number of clusters to be given in advance. Even when the number of clusters is specified, the clustering result may fall into a local optimum. In addition, because the initial cluster centers are chosen at random, the result may differ from one execution to the next. Furthermore, noise in the data has an even more significant impact on the results, so it is important to select the initial cluster centers suitably. Although adopting the subtractive method reduces the influence of randomly selected centers, it is still difficult to deal with non-spherical clusters. Combining the FCM algorithm with a hierarchical algorithm can deal with non-spherical clusters, but it cannot easily handle noise.
Hence, this paper proposes a density-based multistage clustering algorithm that combines subtractive clustering, fuzzy clustering, and hierarchical clustering to solve the problems mentioned above. The approach has three stages: the first suitably selects the initial cluster centers; the second modifies the distribution of data points; and the third merges clusters until an appropriate number of clusters is reached. The experimental results show that the proposed method can improve clustering performance.
|
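The abstract names subtractive clustering as a way to avoid random initial centers and Fuzzy C-Means as the fuzzy clustering step. The record does not give the thesis's actual formulas or parameters, so the following is only a minimal sketch of the textbook versions of those two techniques, not the author's method. The function names, the radius `ra`, the factor 1.5 for the second radius, the fuzzifier `m = 2`, and the toy data are all assumptions made for illustration. The later stages (redistributing points and merging clusters) are not sketched, since the abstract gives no detail about them.

```python
import math


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def subtractive_centers(points, n_centers, ra=1.0):
    """Pick initial centers by density potential (textbook subtractive clustering).

    ra is an assumed neighbourhood radius; the second radius 1.5 * ra is a
    commonly used default, not a value taken from the thesis.
    """
    alpha = 4.0 / ra ** 2
    beta = 4.0 / (1.5 * ra) ** 2
    # Potential of a point is high when many other points lie close to it.
    potential = [sum(math.exp(-alpha * sq_dist(p, q)) for q in points)
                 for p in points]
    centers = []
    for _ in range(n_centers):
        best = max(range(len(points)), key=lambda i: potential[i])
        peak = potential[best]
        centers.append(points[best])
        # Subtract the chosen center's influence so the next pick lies elsewhere.
        potential = [pot - peak * math.exp(-beta * sq_dist(p, points[best]))
                     for p, pot in zip(points, potential)]
    return centers


def memberships_for(points, centers, m=2.0, eps=1e-9):
    """FCM membership: proportional to (1 / d^2)^(1/(m-1)), normalised per point."""
    exponent = 1.0 / (m - 1.0)
    rows = []
    for p in points:
        inv = [(1.0 / (sq_dist(p, c) + eps)) ** exponent for c in centers]
        total = sum(inv)
        rows.append([v / total for v in inv])
    return rows


def fuzzy_c_means(points, centers, m=2.0, n_iter=50):
    """Run standard FCM updates starting from the given (non-random) centers."""
    centers = [tuple(c) for c in centers]
    dims = len(points[0])
    for _ in range(n_iter):
        u = memberships_for(points, centers, m)
        new_centers = []
        for k in range(len(centers)):
            # Each center is the mean of all points weighted by membership^m.
            weights = [row[k] ** m for row in u]
            wsum = sum(weights)
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / wsum
                for d in range(dims)))
        centers = new_centers
    return centers, memberships_for(points, centers, m)


# Toy data (hypothetical): two small, well-separated groups of points.
data = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]
initial = subtractive_centers(data, n_centers=2, ra=1.0)
final_centers, memberships = fuzzy_c_means(data, initial)
print("initial centers:", initial)
print("final centers:", final_centers)
```

Because the initial centers come from the density potential rather than a random draw, repeated runs on the same data start from the same centers, which is the reproducibility concern the abstract raises about randomly initialized FCM.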
author2 |
Chiun-chieh Hsu |
author_facet |
Chiun-chieh Hsu Chun-hao Chuang 莊峻豪 |
author |
Chun-hao Chuang 莊峻豪 |
author_sort |
Chun-hao Chuang |
title |
A Density-based Multistage Clustering Algorithm |
publishDate |
2010 |
url |
http://ndltd.ncl.edu.tw/handle/37994188459093532207 |