A Pattern-based Method for Item Predictions over Data Streams

Master's === National Taiwan Normal University === Department of Information and Computer Education === 95 === With the advance of electronic devices, more and more application data is collected quickly and continuously, forming a data stream. Two challenges arise when performing item predictions in a data stream. The first one is that the data is conti...

Bibliographic Details
Main Authors: Tsui-Feng Yen, 嚴翠鳳
Other Authors: Jia-Ling Koh
Format: Others
Language: zh-TW
Published: 2007
Online Access: http://ndltd.ncl.edu.tw/handle/10499635741510175578
id ndltd-TW-095NTNU5395050
record_format oai_dc
spelling ndltd-TW-095NTNU5395050 2015-12-07T04:03:42Z http://ndltd.ncl.edu.tw/handle/10499635741510175578 A Pattern-based Method for Item Predictions over Data Streams 資料流序列中資料項預測方法之研究 Tsui-Feng Yen 嚴翠鳳 Master's National Taiwan Normal University Department of Information and Computer Education 95 Jia-Ling Koh 柯佳伶 2007 學位論文 ; thesis 45 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Taiwan Normal University === Department of Information and Computer Education === 95 === With the advance of electronic devices, more and more application data is collected quickly and continuously, forming a data stream. Two challenges arise when performing item predictions in a data stream. First, the data arrives continuously at high speed, so it must be processed efficiently. Second, the data distribution and the implicit patterns might change over time. In this thesis, a structure named prediction-tree is proposed to quickly discover prediction rules from repeating patterns in the training data. To adapt to concept changes, new prediction rules must be generated by re-mining repeating patterns in the most recent sliding window. The first approach, named ERT, monitors the accuracy of predictions in a sliding window to detect concept changes. When the error rate in a sliding window exceeds a given threshold, new prediction rules are generated by re-mining repeating patterns; the previous prediction rules with high accuracy are then retained and combined with the newly generated ones. The other approach triggers the re-mining every other non-overlapping data window. Two variations of this window-based triggering approach, named WANR and WRNR, are provided according to whether or not the previous rules are retained and combined with the new ones. The experimental results show that the error rate of WRNR is slightly higher than that of the other methods, while ERT is the most efficient and effective of the three because it needs to adjust rules only when concept changes are detected.
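The error-rate trigger described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The prediction-tree is not specified in this record, so a stand-in miner (`mine_rules`, a hypothetical helper) maps each context of `k` preceding items to its most frequent successor. The function name `ert_predict`, the parameters `window_size`, `threshold`, `keep_accuracy`, and `k`, and the choice to let retained rules win on conflict are all assumptions made for illustration.

```python
from collections import Counter, defaultdict, deque


def mine_rules(window, k=2):
    """Stand-in miner (NOT the prediction-tree): most frequent successor per length-k context."""
    followers = defaultdict(Counter)
    for i in range(len(window) - k):
        followers[tuple(window[i:i + k])][window[i + k]] += 1
    return {ctx: counts.most_common(1)[0][0] for ctx, counts in followers.items()}


def ert_predict(training, stream, window_size=12, threshold=0.5,
                keep_accuracy=0.8, k=2):
    """Predict each stream item; re-mine rules when the windowed error rate exceeds threshold."""
    rules = mine_rules(training, k)
    recent = deque(training[-window_size:], maxlen=window_size)  # sliding window of items
    errors = deque(maxlen=window_size)  # 1 = wrong or missing prediction, 0 = correct
    hits, uses = Counter(), Counter()   # per-rule accuracy since the last re-mining
    predictions, remines = [], 0

    for item in stream:
        ctx = tuple(recent)[-k:]
        guess = rules.get(ctx)
        predictions.append(guess)
        if guess is not None:
            uses[ctx] += 1
            hits[ctx] += guess == item
        errors.append(0 if guess == item else 1)
        recent.append(item)

        # Assumed trigger: error rate over a full window is above the threshold.
        if len(errors) == window_size and sum(errors) / window_size > threshold:
            # Retain previously accurate rules (assumption: they win on conflict).
            kept = {c: r for c, r in rules.items()
                    if uses[c] and hits[c] / uses[c] >= keep_accuracy}
            rules = {**mine_rules(list(recent), k), **kept}
            errors.clear()
            hits.clear()
            uses.clear()
            remines += 1
    return predictions, remines


if __name__ == "__main__":
    training = list("abc" * 10)
    stream = list("abc" * 10 + "acb" * 20)  # the pattern changes midway
    _, remines = ert_predict(training, stream)
    print("re-mining triggered", remines, "time(s)")
```

A window-based variant in the spirit of WANR and WRNR would replace the error-rate test with a fixed item count and either keep or drop `kept`. The record does not say which of the two names corresponds to which choice.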
author2 Jia-Ling Koh
author_facet Jia-Ling Koh
Tsui-Feng Yen
嚴翠鳳
author Tsui-Feng Yen
嚴翠鳳
title A Pattern-based Method for Item Predictions over Data Streams
publishDate 2007
url http://ndltd.ncl.edu.tw/handle/10499635741510175578