Summary: | Master's === Yuan Ze University === Department of Information Management === 90 === In data mining, discovering sequential patterns in sequence databases has received considerable attention. Sequential patterns reveal the characteristics of the sequences in a database. Much recent research addresses sequential patterns, but most of it focuses on improving mining algorithms, and applications of the discovered patterns remain comparatively rare. When the number of sequential patterns is very large, it is difficult to use them effectively for analysis and forecasting. To address this, Tadeusz Morzy proposed a method for clustering sequential patterns, Pattern Oriented Partial Clustering (POPC_GA). In partial clustering, a sequence can belong to more than one cluster. In some situations, however, users want hard clusters, in which each sequence belongs to exactly one cluster.
In this thesis, we propose the Pattern Oriented Hard Clustering (POHC) algorithm, which is based on the K-means technique, to cluster sequences. POHC differs from POPC_GA in that it performs better and identifies the common characteristics of sequential patterns. In addition, we propose parallel versions of POHC and POPC_GA to improve their performance.
|