A Study of the Characteristics of Incremental Hierarchical Data Clustering Based on Spherical Clusters

Bibliographic Details
Main Authors: Pao-Hsu, Wang, 王寶絮
Other Authors: 歐陽彥正
Format: Others
Language: zh-TW
Published: 2002
Online Access: http://ndltd.ncl.edu.tw/handle/88989151577357541182
Description
Summary: Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 90 === The advantage of a hierarchical data clustering algorithm is that, by building a dendrogram, it captures and preserves information about the data merging process, whereas a partitional data clustering algorithm shows only the final clustering result. However, because conventional hierarchical data clustering algorithms have O(n^2) or higher time complexity, they are almost impractical for large databases. One approach to this problem is to employ an incremental clustering algorithm. This thesis studies the characteristics of a novel incremental clustering algorithm proposed by our research team. The algorithm operates on spherical subclusters, and the influence of each spherical subcluster is modeled by a spherical Gaussian function. Experimental results reveal that the algorithm offers the following desirable features: 1. it generally operates in linear time; 2. it identifies clusters of arbitrary shape more effectively than the existing incremental clustering algorithm; 3. it requires no sophisticated parameter tuning; 4. it exhibits less order dependence than the existing incremental clustering algorithm.
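
The summary does not give the algorithm's details, so the following is only a minimal sketch of the general idea of incremental clustering with spherical subclusters whose influence is a spherical Gaussian. It is not the thesis's method. The names `gaussian_influence` and `incremental_cluster`, the fixed `sigma`, the `threshold` rule, and the running-mean centroid update are all assumptions made for illustration, and the hierarchical (dendrogram) step is omitted.

```python
import math


def gaussian_influence(point, center, sigma):
    """Spherical Gaussian influence of a subcluster centered at `center` on `point`."""
    d2 = sum((p - c) ** 2 for p, c in zip(point, center))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def incremental_cluster(points, sigma=1.0, threshold=0.5):
    """Process points one at a time (hypothetical rule, not the thesis's).

    A point joins the subcluster with the strongest influence on it if that
    influence reaches `threshold`; otherwise it starts a new subcluster.
    """
    subclusters = []  # each entry: {"center": [...], "count": int}
    labels = []
    for point in points:
        best_idx, best_inf = None, 0.0
        for idx, sc in enumerate(subclusters):
            inf = gaussian_influence(point, sc["center"], sigma)
            if inf > best_inf:
                best_idx, best_inf = idx, inf
        if best_idx is not None and best_inf >= threshold:
            sc = subclusters[best_idx]
            sc["count"] += 1
            # Running-mean update: the centroid moves toward the new point.
            sc["center"] = [
                c + (p - c) / sc["count"] for c, p in zip(sc["center"], point)
            ]
            labels.append(best_idx)
        else:
            subclusters.append({"center": list(point), "count": 1})
            labels.append(len(subclusters) - 1)
    return subclusters, labels


points = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.2, 5.1), (0.1, 0.1)]
subclusters, labels = incremental_cluster(points, sigma=1.0, threshold=0.5)
print(labels)
```

In this sketch each point is compared against every existing subcluster once, so the total cost grows linearly with the number of points only as long as the number of subclusters stays bounded. Because centroids shift as points arrive, the result can depend on input order, which is the kind of order dependence the summary refers to.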