A Two-staged Clustering Algorithm with Multiple Scales

Bibliographic Details
Main Authors: Rung-Ting Chien, 簡榮廷
Other Authors: Chien-Lung Chan
Format: Others
Language: zh-TW
Published: 2003
Online Access: http://ndltd.ncl.edu.tw/handle/15988641902381400969
Description
Summary: Master's thesis === Yuan Ze University === Graduate Institute of Information Management === 91 === Cluster analysis is a data mining technique whose goal is to find hidden patterns in data. In related studies, most researchers weight all attributes equally when clustering and use only metric calculations to handle all four kinds of measurement scales. We believe that a traditional clustering algorithm can incorporate an expert's subjective judgment, and that the different scales (nominal, ordinal, interval, and ratio) should each have their own method for calculating the degree of similarity. We therefore combine expert weights and multiple scales in the clustering process. Our purpose is to address two problems: clustering results are hard to explain, and they may not meet the decision maker's needs. In this paper, we propose a two-stage clustering algorithm to solve these problems. In the first stage, we use training data to find parameters that can improve cluster quality. In the second stage, we cluster all the data with these parameters. The algorithm uses multiple scales and unequal weights to handle all kinds of data, and we test it on four standard data sets (Wisconsin Breast Cancer Data, Contraceptive Method Choice Data, Iris Education Data, and Balance Scale Weight & Distance Data). Our experiments lead to two conclusions. First, clustering with multiple-scale calculation improves the quality of similarity within groups and dissimilarity between groups. Second, clustering with expert weights gives better prediction than clustering with equal weights. We therefore believe that a clustering algorithm combining multiple scales with expert weights can not only improve clustering quality but also meet the decision maker's requirements.
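
The abstract does not give the thesis's actual similarity formulas, so the following is only a minimal sketch of the general idea of scale-aware, weighted dissimilarity. The per-scale rules (simple matching for nominal, range-normalised difference for ordinal and interval, relative difference for ratio) and all names such as `attribute_distance` and `weighted_distance` are assumptions chosen for illustration, in the spirit of a Gower-style coefficient, and are not taken from the thesis.

```python
from typing import Any, Sequence

# Scale labels (hypothetical names, not from the thesis).
NOMINAL, ORDINAL, INTERVAL, RATIO = "nominal", "ordinal", "interval", "ratio"


def attribute_distance(a: Any, b: Any, scale: str, value_range: float = 0.0) -> float:
    """Dissimilarity in [0, 1] for a single attribute, chosen by its scale."""
    if scale == NOMINAL:
        # Categories can only match or differ.
        return 0.0 if a == b else 1.0
    if scale == RATIO:
        # Assumes non-negative values; a true zero makes relative difference meaningful.
        largest = max(abs(a), abs(b))
        return 0.0 if largest == 0 else abs(a - b) / largest
    if scale in (ORDINAL, INTERVAL):
        # Ordinal values are assumed to be integer rank codes.
        # value_range is the attribute's max minus min over the data set.
        return 0.0 if value_range == 0 else abs(a - b) / value_range
    raise ValueError(f"unknown scale: {scale}")


def weighted_distance(
    x: Sequence[Any],
    y: Sequence[Any],
    scales: Sequence[str],
    ranges: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Weighted mean of per-attribute dissimilarities (weights act as expert weights)."""
    total = sum(
        w * attribute_distance(a, b, s, r)
        for a, b, s, r, w in zip(x, y, scales, ranges, weights)
    )
    return total / sum(weights)


# Two made-up records: (colour, rank, temperature, mass).
x = ("red", 1, 10.0, 4.0)
y = ("blue", 3, 20.0, 8.0)
scales = (NOMINAL, ORDINAL, INTERVAL, RATIO)
ranges = (0.0, 4.0, 40.0, 0.0)  # only used for ordinal and interval attributes

equal = weighted_distance(x, y, scales, ranges, (1, 1, 1, 1))
expert = weighted_distance(x, y, scales, ranges, (4, 1, 1, 1))
```

With equal weights the four attributes contribute equally; raising the weight on the nominal attribute pulls the overall dissimilarity toward that attribute's value, which is how an expert's judgment about attribute importance could influence which records end up grouped together.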