Mining changes of patterns from multi-period datasets

Master's === National Cheng Kung University === Institute of Information Management === 95 === Pattern discovery is a common task in data mining. Given transaction datasets from multiple periods, we address a temporal data mining problem: detecting patterns whose changes of interest have been consistent from some period through the last period....


Bibliographic Details
Main Authors: Chun-Ta Dai, 戴群達
Other Authors: I-Lin Wang
Format: Others
Language: zh-TW
Published: 2007
Online Access: http://ndltd.ncl.edu.tw/handle/76751157086742350546
id ndltd-TW-095NCKU5396024
record_format oai_dc
spelling ndltd-TW-095NCKU5396024 2015-10-13T13:59:57Z http://ndltd.ncl.edu.tw/handle/76751157086742350546 Mining changes of patterns from multi-period datasets 探勘多期資料下樣式變化之研究 Chun-Ta Dai 戴群達 Master's National Cheng Kung University Institute of Information Management 95 I-Lin Wang 王逸琳 2007 thesis 66 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Cheng Kung University === Institute of Information Management === 95 === Pattern discovery is a common task in data mining. Given transaction datasets from multiple periods, we address a temporal data mining problem: detecting patterns whose changes of interest have been consistent from some period through the last period. Discovering such changes in a multi-period transaction database helps managers detect trends in customer needs so that potential customers can be identified. To the best of our knowledge, previous studies in change mining focus only on datasets of two periods, although trends in changes are more meaningful over multi-period datasets in real-world applications. Conventional data mining techniques that seek frequent patterns could be modified to mine changes from multi-period datasets, but such approaches would require many pairwise comparisons between the datasets of consecutive periods and would therefore be inefficient. In this thesis, we propose an algorithm called MCP for mining changes from multi-period datasets. MCP is based on a novel data structure modified from the popular frequent-pattern tree (FP-tree) and finds the target patterns efficiently. In particular, starting from the last two periods, the algorithm first constructs a candidate-pattern forest (CP-forest) to store the patterns with qualified changes, and then iteratively updates the CP-forest using the dataset of each period. The CP-forest is designed so that useless information is not stored and qualified patterns can be easily identified by tree traversals. Computational experiments were conducted to compare MCP with another algorithm, modiFP, which is modified from the popular FP-growth algorithm to mine changes of patterns from multi-period datasets. Several parameters were used to evaluate the performance of MCP and modiFP, and the results show that MCP is much more efficient than modiFP, especially as the number of periods increases or when the datasets of consecutive periods are more similar.
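The problem stated in the abstract can be illustrated with a naive baseline of the kind the abstract calls inefficient: compute pattern supports in every period, then compare consecutive periods pairwise, walking backwards from the last two periods. This is only a sketch and is not the MCP algorithm or the CP-forest. The record does not define what counts as a "qualified change", so the sketch assumes it means that a pattern's support moves in one direction by at least `min_delta` at every step. The function names `supports` and `consistent_changes` and the parameters `min_delta` and `max_size` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def supports(transactions, max_size=2):
    """Relative support of every itemset with up to max_size items in one period."""
    n = len(transactions)
    if n == 0:
        return {}
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    return {pattern: c / n for pattern, c in counts.items()}


def consistent_changes(periods, min_delta=0.1, max_size=2):
    """Map each pattern to (start_period_index, direction).

    A pattern is reported when its support changed by at least min_delta in
    the same direction at every step from start_period_index to the last
    period (an assumed definition, not taken from the thesis).
    """
    if len(periods) < 2:
        raise ValueError("need at least two periods")
    sup = [supports(p, max_size) for p in periods]
    patterns = set().union(*sup)
    result = {}
    for pat in patterns:
        series = [s.get(pat, 0.0) for s in sup]
        # Start from the last two periods, then extend backwards.
        last_step = series[-1] - series[-2]
        if abs(last_step) < min_delta:
            continue
        sign = 1 if last_step > 0 else -1
        start = len(series) - 2
        while start > 0 and sign * (series[start] - series[start - 1]) >= min_delta:
            start -= 1
        result[pat] = (start, "up" if sign > 0 else "down")
    return result


# Three toy periods of four transactions each (made-up data for illustration).
periods = [
    [["a", "b"], ["b"], ["b"], ["c"]],
    [["a"], ["a", "b"], ["b"], ["c"]],
    [["a"], ["a"], ["a", "c"], ["b"]],
]
changes = consistent_changes(periods, min_delta=0.25)
```

Every period is rescanned and every pattern is compared across each pair of consecutive periods, which is the cost the abstract says a dedicated structure is meant to avoid.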
author2 I-Lin Wang
author_facet I-Lin Wang
Chun-Ta Dai
戴群達
author Chun-Ta Dai
戴群達
spellingShingle Chun-Ta Dai
戴群達
Mining changes of patterns from multi-period datasets
author_sort Chun-Ta Dai
title Mining changes of patterns from multi-period datasets
publishDate 2007
url http://ndltd.ncl.edu.tw/handle/76751157086742350546