Mining Streaming Data with Concept Drifts

Bibliographic Details
Main Author: 蘇郁喬
Other Authors: 陳耀輝
Format: Others
Language: zh-TW
Online Access: http://ndltd.ncl.edu.tw/handle/64960926653396805165
Description
Summary: Master's === National Chiayi University (國立嘉義大學) === Graduate Institute of Computer Science and Information Engineering (資訊工程學系研究所) === 100 === Data mining frequently uses machine learning methods to process data, and these methods must learn from training data before they can make predictions on new data. Traditional data mining research on streaming data usually assumes that the data distribution is stable. In the real world, however, concept drift may occur over time in continuously arriving data. As the quantity of input data grows, storing the enormous amount of training data not only consumes memory but also increases training time. Traditional approaches to handling concept drift usually select training data from a mixture of data exhibiting all kinds of drift, but a model built from such training data may not be suitable. This research develops an effective and efficient method for selecting useful information from a data stream based on data blocks that share a common type of concept drift. Experiments on data generated from both STAGGER concepts and moving hyperplanes show that the proposed method produces better results.
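
As an illustration of the kind of block-wise drifting stream the summary refers to, the following minimal Python sketch generates data from the three classic STAGGER concepts and pools the blocks that share a concept. It is not the thesis's method: the function names `stagger_stream` and `group_blocks_by_concept`, the block-by-block concept rotation, and the use of the generator's known concept index (instead of a drift detector) are all assumptions made for illustration only.

```python
import random

SIZES = ("small", "medium", "large")
COLORS = ("red", "green", "blue")
SHAPES = ("square", "circular", "triangular")

# The three classic STAGGER target concepts over (size, color, shape).
CONCEPTS = (
    lambda size, color, shape: size == "small" and color == "red",
    lambda size, color, shape: color == "green" or shape == "circular",
    lambda size, color, shape: size in ("medium", "large"),
)


def stagger_stream(n_blocks, block_size, seed=0):
    """Yield (block_index, concept_index, examples), one tuple per data block.

    Assumption for illustration: the active concept changes abruptly at
    every block boundary, cycling through the three concepts in order.
    """
    rng = random.Random(seed)
    for block_index in range(n_blocks):
        concept_index = block_index % len(CONCEPTS)
        concept = CONCEPTS[concept_index]
        examples = []
        for _ in range(block_size):
            x = (rng.choice(SIZES), rng.choice(COLORS), rng.choice(SHAPES))
            examples.append((x, concept(*x)))  # (attributes, boolean label)
        yield block_index, concept_index, examples


def group_blocks_by_concept(stream):
    """Pool the examples of all blocks that were generated by the same concept.

    Hypothetical helper: it relies on the concept index supplied by the
    generator, whereas a real system would have to infer which blocks
    share a drift type.
    """
    grouped = {}
    for _, concept_index, examples in stream:
        grouped.setdefault(concept_index, []).extend(examples)
    return grouped
```

Calling `group_blocks_by_concept(stagger_stream(6, 50))` returns a dictionary with one pooled training set per concept, so a separate model could be trained on each pool instead of on a mixture of all concepts.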