The Earliest-Time-Point Approach to Mining Frequent Up-to-Date Patterns in Temporal Databases

Bibliographic Details
Main Authors: Hsin-pao Huang, 黃馨葆
Other Authors: Ye-In Chang
Format: Others
Language: en_US
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/f39dd8
Description
Summary: Master's === National Sun Yat-sen University === Department of Computer Science and Engineering === 102 === Recently, temporal data mining has become an important topic that attracts many researchers. Its main concerns are analyzing temporal data and discovering temporal patterns. Although existing algorithms can discover temporal patterns, they derive only the patterns that are frequent over the whole database. Hong et al. therefore proposed the concept of up-to-date patterns, which considers only the items or itemsets that are frequent within a flexible period of time extending from the current time back toward the oldest past time. Hong et al. also proposed the UDP-tree construction algorithm and the UDP-growth mining algorithm to find all frequent up-to-date patterns. The UDP-tree construction algorithm first derives the frequent up-to-date 1-patterns together with their frequencies and valid lifetimes, and then constructs a UDP-tree. The UDP-growth mining algorithm then finds the up-to-date k-patterns (k ≥ 2) from the UDP-tree. However, these two algorithms (together, the UDP algorithm) have some problems in finding all frequent up-to-date patterns. First, when deriving frequent up-to-date k-patterns (k ≥ 1), they must check many times whether an item or itemset is frequent within its corresponding lifetime, so the UDP algorithm spends a long execution time on this check. Second, when checking whether a candidate k-pattern (k ≥ 3) is a frequent up-to-date k-pattern, the UDP algorithm may check every candidate up-to-date k-pattern against the formula, which also wastes execution time. Third, some of the results derived by the UDP algorithm are unreasonable. To avoid these problems and improve performance, we propose an Earliest-Time-Point approach that applies two pruning strategies in the tree construction algorithm and the mining algorithm.
The first pruning strategy is applied when checking whether items or itemsets are frequent up-to-date k-patterns (k ≥ 1); it can reduce the execution time of that check. The second pruning strategy is applied when checking whether itemsets can be frequent up-to-date k-patterns (k ≥ 3); it may prune some candidates so that they need not be checked further, which reduces the number of candidates. In addition, we propose an extension of the formula that decides the valid appearance time of a pattern, so that unreasonable results are avoided. As a result, our approach finds all frequent up-to-date patterns faster than the UDP algorithm and avoids unreasonable results. Our simulation results show that the Earliest-Time-Point approach is more efficient than the UDP algorithm.
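
To make the notion of an up-to-date pattern concrete, the following is a minimal brute-force sketch in Python. It is not the UDP-tree or UDP-growth algorithm, and it is not the Earliest-Time-Point approach of the thesis, since the abstract does not specify the pruning strategies or the formula. It only illustrates the idea that an itemset can be frequent within a period ending at the current time even when it is not frequent over the whole database. The function name `earliest_valid_start`, the sample database `db`, the treatment of the lifetime as a suffix of the transaction list, and the threshold 0.6 are all assumptions made for illustration.

```python
from typing import FrozenSet, Iterable, List, Optional, Tuple

Transaction = FrozenSet[str]


def earliest_valid_start(
    transactions: List[Transaction],
    itemset: Iterable[str],
    min_support: float,
) -> Optional[Tuple[int, int]]:
    """Brute-force check of the up-to-date idea (illustrative assumption).

    `transactions` is ordered from oldest (index 0) to newest (last index).
    The function grows a window backwards from the newest transaction and
    returns (start_index, count) for the earliest start at which the itemset's
    count inside the window [start_index, newest] still reaches
    min_support * window_size. It returns None if no such window exists.
    """
    items = frozenset(itemset)
    n = len(transactions)
    count = 0
    best: Optional[Tuple[int, int]] = None
    for start in range(n - 1, -1, -1):
        if items <= transactions[start]:
            count += 1
        window_size = n - start
        if count >= min_support * window_size:
            # Remember the earliest start seen so far that is still valid.
            best = (start, count)
    return best


# Hypothetical sample data: "c" appears only in the recent transactions.
db: List[Transaction] = [
    frozenset({"a", "b"}),
    frozenset({"a"}),
    frozenset({"b"}),
    frozenset({"a", "b"}),
    frozenset({"c"}),
    frozenset({"b", "c"}),
    frozenset({"c"}),
    frozenset({"b", "c"}),
]

# Over the whole database "c" has support 4/8 = 0.5, below the 0.6 threshold,
# but it is frequent in the window that starts at index 2.
result_c = earliest_valid_start(db, {"c"}, 0.6)   # (2, 4)
# "a" occurs only in old transactions, so no window ending now qualifies.
result_a = earliest_valid_start(db, {"a"}, 0.6)   # None
```

This sketch rescans the data for every itemset, which is the kind of repeated checking that the abstract identifies as costly; the tree-based algorithms and pruning strategies described above aim to avoid that cost.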