Summary: | Master's === National Cheng Kung University === Institute of Information Management === 92 === Data clustering is commonly used as a preliminary step in data mining, especially for large, high-dimensional datasets. With appropriate clustering, useful information hidden in the data can be discovered, and this information can support enterprises in problem-solving and decision-making. When the dataset is large, however, searching for an optimal clustering with a partitioning algorithm often takes a long time and may not yield an appropriate number of clusters. Partitioning algorithms require the user to set the number of clusters in advance, which is usually the most difficult part of clustering. Furthermore, when the description space of the data cannot adequately capture the complexity of the data dimensions, the algorithm may produce a poor clustering. To address these problems, this research proposes a solution based on the PAM (Partitioning Around Medoids) algorithm. By combining a heuristic algorithm with the concept of attribute level climbing, the proposed algorithm reduces the time spent searching for an optimal solution and finds an appropriate number of clusters. As a result, the clustering results are better and more comprehensible.
|