Summary: | 碩士 === 國立臺灣師範大學 === 資訊教育學系 === 95 === Because of progressing of various electronic equipments, more and more data of applications is collected quickly and constantly to form a data stream. Two challenges arise when performing item predictions in a data stream. The first one is that the data is continuously inputted in high-speed, such that it is required to perform the processing efficiently. Besides, the data distribution and the implicit patterns might change over time. In this thesis, a structure named prediction-tree is proposed to discover prediction rules from repeating patterns in the training data quickly. For adapting the concept changes, it is necessary to generate new prediction rules by re-mining repeating patterns in the most recent sliding window. The first approach, named ERT, is to monitor the accuracy of predictions in a sliding window for detecting the concept changes. When the error rate in a sliding window is higher than a given threshold value, new prediction rules are generated by re-mining repeating patterns. Then the previous prediction rules with high accuracy are remained to be combined with the new generated ones. The other approach is to trigger the re-mining every other non-overlapping data window. Two variations of the window-based triggering approach, named WANR and WRNR, are provided according to whether the previous rules are remained to be combined with the new ones or not. The experimental results show that the error rate of WRNR is slightly higher than the others. However, ERT is the most efficient and effective one among the three methods because it needs to adjust rules only when the concept changes are detected.
|