The Earliest-Time-Point Approach to Mining Frequent Up-to-Date Patterns in Temporal Databases

Bibliographic Details
Main Authors: Hsin-pao Huang, 黃馨葆
Other Authors: Ye-In Chang
Format: Others
Language: en_US
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/f39dd8
id ndltd-TW-102NSYS5392028
record_format oai_dc
spelling ndltd-TW-102NSYS5392028 2019-05-15T21:32:36Z http://ndltd.ncl.edu.tw/handle/f39dd8 The Earliest-Time-Point Approach to Mining Frequent Up-to-Date Patterns in Temporal Databases 一個以最早出現時間點來探勘時序資料庫中之最新且頻繁樣式的方法 Hsin-pao Huang 黃馨葆 Master's, National Sun Yat-sen University, Department of Computer Science and Engineering, 102 Ye-In Chang 張玉盈 2014 學位論文 ; thesis 88 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Sun Yat-sen University === Department of Computer Science and Engineering === 102 === Temporal data mining has recently become an important topic that attracts many researchers. Its main concerns are analyzing temporal data and discovering temporal patterns. Although existing algorithms can discover temporal patterns, they derive only the patterns that are frequent over the whole database. Hong et al. therefore proposed the concept of up-to-date patterns, which considers only the items or itemsets that are frequent within a flexible period of time, from the current time back to the oldest past time. Hong et al. also proposed the UDP-tree construction algorithm and the UDP-growth mining algorithm to find all frequent up-to-date patterns. The UDP-tree construction algorithm first derives the frequent up-to-date 1-patterns together with their frequencies and valid lifetimes, and then constructs a UDP-tree. The UDP-growth mining algorithm then finds the up-to-date k-patterns (k ≥ 2) from the UDP-tree. However, the UDP-tree construction algorithm and the UDP-growth mining algorithm have some problems in finding all frequent up-to-date patterns. First, when deriving frequent up-to-date k-patterns (k ≥ 1), they must check many times whether an item or itemset is frequent within its corresponding lifetime, so the UDP algorithm needs a long execution time for this check. Second, when checking whether a candidate k-pattern (k ≥ 3) is a frequent up-to-date k-pattern, the UDP algorithm may check all candidate up-to-date k-patterns by the formula, which also wastes execution time. Third, some of the results derived from the UDP algorithm are unreasonable.
Therefore, to avoid these problems and improve performance, we propose an Earliest-Time-Point approach that uses two pruning strategies in the tree construction algorithm and the mining algorithm. The first pruning strategy is applied when checking whether items or itemsets are frequent up-to-date k-patterns (k ≥ 1), and it can reduce the execution time. The second pruning strategy is applied when checking whether itemsets will be frequent up-to-date k-patterns (k ≥ 3); it prunes some candidates so that they need not be checked further, which reduces the number of candidates. In addition, we propose an extension of the formula for deciding the valid appearance time of a pattern, which avoids unreasonable results. Our approach therefore finds all frequent up-to-date patterns faster than the UDP algorithm while avoiding unreasonable results. Our simulation results show that the Earliest-Time-Point approach is more efficient than the UDP algorithm.
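The notion of an up-to-date pattern described above can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the UDP algorithm and not the Earliest-Time-Point approach of the thesis. It assumes that transactions are listed oldest first, that a lifetime always ends at the most recent transaction, that a candidate lifetime starts at a transaction containing the itemset, and that an itemset is frequent in a lifetime when its count is at least min_ratio times the number of transactions in that lifetime. The function name up_to_date_lifetime and the parameter min_ratio are hypothetical.

```python
from typing import List, Optional, Set, Tuple


def up_to_date_lifetime(
    transactions: List[Set[str]],
    itemset: Set[str],
    min_ratio: float,
) -> Optional[Tuple[int, int]]:
    """Return (start_index, count) for the longest lifetime ending at the
    last transaction in which `itemset` is frequent, or None if there is none.

    Assumption: `transactions` is ordered from oldest to newest.
    """
    n = len(transactions)
    # Indices of the transactions that contain every item of the itemset.
    occurrences = [i for i, t in enumerate(transactions) if itemset <= t]

    # Try the earliest occurrence first, since it gives the longest lifetime.
    for k, start in enumerate(occurrences):
        count = len(occurrences) - k   # occurrences inside [start, n - 1]
        window = n - start             # transactions inside [start, n - 1]
        if count >= min_ratio * window:
            return start, count
    return None


if __name__ == "__main__":
    db = [{"a"}, {"b"}, {"b"}, {"a", "b"}, {"a"}, {"a", "c"}]
    print(up_to_date_lifetime(db, {"a"}, 0.75))  # (3, 3)
    print(up_to_date_lifetime(db, {"b"}, 0.75))  # None
```

In this toy database, {"a"} is not frequent over all six transactions at a ratio of 0.75 (4 of 6), but it is frequent in the lifetime that starts at index 3 (3 of 3), which is the kind of recent pattern a whole-database support count would miss.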
author2 Ye-In Chang
author_facet Ye-In Chang
Hsin-pao Huang
黃馨葆
author Hsin-pao Huang
黃馨葆
spellingShingle Hsin-pao Huang
黃馨葆
The Earliest-Time-Point Approach to Mining Frequent Up-to-Date Patterns in Temporal Databases
author_sort Hsin-pao Huang
title The Earliest-Time-Point Approach to Mining Frequent Up-to-Date Patterns in Temporal Databases
publishDate 2014
url http://ndltd.ncl.edu.tw/handle/f39dd8