Summary: | Doctoral === National Cheng Kung University === Institute of Manufacturing Information and Systems === 102 === Data mining techniques have been widely used in a variety of data-analysis applications in recent years. Periodic pattern mining, which finds useful rules or patterns in a single long-term time series, has become a popular research topic. In real-life applications, "partial" periodic patterns are more flexible than "full" periodic patterns, because the events at some time positions in a pattern may be uncertain. However, since partial periodic pattern mining can ignore the events at some time positions in a period, it has to generate a large number of candidate patterns. Developing efficient partial periodic pattern mining algorithms that reduce this time cost is therefore a critical issue. In addition, most studies on partial periodic pattern mining consider only the supports of items in period segments, so many useful patterns with low frequency but high significance in event sequence data may not be found. Hence, this dissertation proposes two efficient projection-based mining algorithms and introduces two new problems: weighted partial periodic pattern mining (WPPP) and partial periodic pattern mining with multiple minimum constraints (PPPMM).
For traditional partial periodic pattern mining, two algorithms, PPA (Projection-based Pattern Mining Approach) and PRA (Pruning Redundancy Approach), are proposed to find partial periodic patterns more efficiently in a single event sequence with one or multiple events at each time point. Unlike PPA, which uses no additional strategies, PRA adopts two strategies, pruning and filtering, to reduce the large number of candidates generated during mining. Experimental results on several synthetic and real datasets show that the proposed approaches achieve up to a 70% performance improvement over the traditional MSA (Max-Subpattern Hit Set) algorithm.
For WPPP, the downward-closure property does not hold. To handle this, an upper-bound model is developed that uses the maximum weight of all events in a period segment as the upper bound of any sub-pattern in that segment. Based on this model, a two-phase mining approach, PWA (Projection-based Weighted Mining Approach), is presented to carry out the WPPP mining task. For PPPMM, an efficient two-phase mining approach, PAMMS (Projection-based Mining Approach with Multiple Minimum Supports), is proposed. Since the downward-closure property does not hold in PPPMM either, the minimum of the constraint values of all events is used during mining to avoid information loss. Finally, the experimental results show that both PWA and PAMMS perform well in terms of pruning effectiveness and execution efficiency on synthetic and real datasets.
|
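The notions used in the summary above (period segments, partial patterns with ignored positions, and the segment-level maximum-weight upper bound) can be illustrated with a small brute-force sketch. This is not PPA, PRA, PWA, or PAMMS, and it uses no projection. All function names, the toy sequence, and the weights are hypothetical. The weight of a pattern is assumed here to be the average weight of its fixed events, which the summary does not specify.

```python
from typing import Dict, List, Optional, Sequence

# A partial periodic pattern: one entry per time position in the period.
# None marks a position whose event is ignored ("don't care").
Pattern = Sequence[Optional[str]]


def split_into_segments(sequence: Sequence[str], period: int) -> List[Sequence[str]]:
    """Cut the event sequence into consecutive full-length period segments."""
    full = len(sequence) // period
    return [sequence[i * period:(i + 1) * period] for i in range(full)]


def matches(pattern: Pattern, segment: Sequence[str]) -> bool:
    """A segment matches if every fixed position of the pattern agrees with it."""
    return all(p is None or p == e for p, e in zip(pattern, segment))


def support(pattern: Pattern, segments: List[Sequence[str]]) -> int:
    """Number of period segments that contain the pattern."""
    return sum(1 for s in segments if matches(pattern, s))


def weighted_support(pattern: Pattern, segments: List[Sequence[str]],
                     weights: Dict[str, float]) -> float:
    """ASSUMPTION: pattern weight = average weight of its fixed events."""
    events = [p for p in pattern if p is not None]
    pattern_weight = sum(weights[e] for e in events) / len(events)
    return pattern_weight * support(pattern, segments)


def upper_bound(pattern: Pattern, segments: List[Sequence[str]],
                weights: Dict[str, float]) -> float:
    """Sum, over matching segments, of the maximum event weight in the segment."""
    return sum(max(weights[e] for e in s) for s in segments if matches(pattern, s))


# Hypothetical toy data: period 3 gives segments abc, abd, abc, acd.
segments = split_into_segments("abcabdabcacd", 3)
weights = {"a": 0.2, "b": 0.4, "c": 0.6, "d": 1.0}

sub_pattern = ("a", None, None)
super_pattern = ("a", None, "d")

# Under the averaging assumption, the super-pattern can have a larger weighted
# support than its sub-pattern, so downward closure fails for the weighted measure.
print(weighted_support(sub_pattern, segments, weights))
print(weighted_support(super_pattern, segments, weights))

# The segment-maximum bound never increases when the pattern is extended,
# and it never falls below the weighted support of the pattern itself.
print(upper_bound(sub_pattern, segments, weights))
print(upper_bound(super_pattern, segments, weights))
```

In this toy case the bound for the shorter pattern is at least as large as the bound for its extension, which is the kind of property a two-phase approach could exploit: prune candidates by the bound first, then verify the actual weighted values.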