Summary: | Master's === National Taiwan University === Graduate Institute of Information Management === 96 === As both the dimensionality and the volume of data increase, existing clustering methods that operate in the full feature space no longer cluster database data well. Subspace clustering has therefore attracted growing attention. In this thesis, we propose a novel subspace mining method that considers all frequent subspaces simultaneously when selecting the significant ones. The proposed method consists of three phases. First, we project all data points onto each pair of dimensions and generate frequent subspaces. Second, we join frequent subspaces to form larger ones. Finally, we adopt a greedy algorithm to summarize the frequent subspaces found and select the significant subspaces. The experimental results show that the proposed method outperforms the FIRES and DUSC methods in terms of quality and coverage.
|
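
The first and third phases described in the summary can be illustrated with a minimal Python sketch. This is not the thesis's actual algorithm, since the abstract does not give its details. The sketch assumes that a "frequent subspace" is a grid cell in a two-dimensional projection holding at least `min_support` points, and that the greedy step repeatedly picks the candidate covering the most still-uncovered points. The names `frequent_pair_subspaces`, `greedy_select`, `bins`, and `min_support` are hypothetical, and the second phase (joining subspaces into larger ones) is omitted.

```python
from itertools import combinations
from collections import defaultdict


def frequent_pair_subspaces(points, bins=4, min_support=3):
    """Phase 1 (assumed form): grid each 2-D projection and keep dense cells.

    points: list of equal-length tuples with values in [0, 1).
    Returns {(dim_a, dim_b, cell_a, cell_b): set of point indices}.
    """
    n_dims = len(points[0])
    frequent = {}
    for a, b in combinations(range(n_dims), 2):
        cells = defaultdict(set)
        for idx, p in enumerate(points):
            # Map each coordinate to an equal-width grid cell index.
            cells[(int(p[a] * bins), int(p[b] * bins))].add(idx)
        for (ca, cb), members in cells.items():
            if len(members) >= min_support:
                frequent[(a, b, ca, cb)] = members
    return frequent


def greedy_select(candidates):
    """Phase 3 (assumed form): greedy selection by marginal coverage.

    Repeatedly picks the candidate that covers the most uncovered points
    and stops when no candidate adds new coverage.
    """
    covered = set()
    selected = []
    remaining = dict(candidates)
    while remaining:
        # Iterating over sorted keys makes tie-breaking deterministic.
        best = max(sorted(remaining), key=lambda k: len(remaining[k] - covered))
        if not remaining[best] - covered:
            break
        selected.append(best)
        covered |= remaining.pop(best)
    return selected, covered


if __name__ == "__main__":
    data = [
        (0.10, 0.10, 0.90), (0.15, 0.20, 0.10), (0.20, 0.05, 0.50),
        (0.80, 0.90, 0.30), (0.90, 0.80, 0.35), (0.85, 0.85, 0.30),
    ]
    freq = frequent_pair_subspaces(data, bins=2, min_support=3)
    chosen, covered = greedy_select(freq)
    print(chosen, sorted(covered))
```

In this toy data set, several candidate cells contain the same three points, so the greedy step keeps only one of them. That is the kind of redundancy a summarization phase is meant to remove, although the thesis may score candidates differently.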