Applying the Support Vector Regression to the Missing Value Problems

Bibliographic Details
Main Authors: Hsi-An Chen, 陳璽安
Other Authors: Zne-Jung Lee
Format: Others
Language: zh-TW
Published: 2010
Online Access: http://ndltd.ncl.edu.tw/handle/02368330022191248233
Description
Summary: Master's === Huafan University === Master's Program, Department of Information Management === 98 === Data mining is now widely used by many enterprises. Data can go missing when records move from paperwork to electronic systems, because of human error or out-of-date information. Usually such records are deleted, or the missing values are filled with the average value, 0, or the mode, but this is only suitable for small amounts of data. Otherwise it affects the accuracy of the data and ultimately leaves it unable to provide reliable information to the user. This thesis tests on open datasets. It removes some values at random from the open datasets, then backfills the numerical values using the average value, 0, a Back-propagation Network (BPN), and Support Vector Regression (SVR). Finally, the thesis uses a regression tree to compare the results. The results show that the values predicted by SVR are closest to the original values, that is, SVR gives the smallest average error for the missing values.
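
The comparison described in the summary can be sketched roughly as follows. This is not the thesis's code: it is a minimal illustration that assumes scikit-learn and NumPy, uses a small synthetic table in place of the open datasets, and picks the SVR settings (RBF kernel, C=10, epsilon=0.1) arbitrarily. The function names `make_synthetic` and `compare_imputers` are hypothetical, and the BPN filler and the regression-tree comparison are left out for brevity.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_synthetic(n_rows=200, seed=1):
    """Build a toy table (assumption: stands in for an open dataset)."""
    rng = np.random.default_rng(seed)
    x0 = rng.normal(size=n_rows)
    x1 = rng.normal(size=n_rows)
    # The third column depends on the first two, so it can be predicted.
    y = 10.0 + 2.0 * x0 - x1 + 0.1 * rng.normal(size=n_rows)
    return np.column_stack([x0, x1, y])


def compare_imputers(data, target_col, missing_rate=0.2, seed=0):
    """Hide random values in one column, refill them three ways,
    and return the mean absolute error of each filler."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n_rows = data.shape[0]

    # Pick the rows whose target value is treated as missing.
    n_missing = max(1, int(round(missing_rate * n_rows)))
    missing = np.zeros(n_rows, dtype=bool)
    missing[rng.choice(n_rows, size=n_missing, replace=False)] = True

    truth = data[missing, target_col]          # held-out original values
    observed = data[~missing, target_col]      # values that remain visible
    features = np.delete(data, target_col, axis=1)

    fills = {
        "zero": np.zeros(n_missing),
        "mean": np.full(n_missing, observed.mean()),
    }

    # SVR learns the target column from the other columns of complete rows.
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
    model.fit(features[~missing], observed)
    fills["svr"] = model.predict(features[missing])

    return {name: float(np.mean(np.abs(values - truth)))
            for name, values in fills.items()}


if __name__ == "__main__":
    for name, error in compare_imputers(make_synthetic(), target_col=2).items():
        print(f"{name}: mean absolute error = {error:.3f}")
```

Filling with 0 or the mean ignores the other columns of each record, whereas the regression model predicts each missing value from them, which is the reason a learned filler can come closer to the original values when the columns are related.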