Data Mining with Uncertain Data
Master's thesis === National University of Kaohsiung === Master's Program, Department of Electrical Engineering === 97 === Machine learning and data mining are two important kinds of techniques for extracting valuable information from datasets. Although current mining and learning technologies can handle large amounts of data, the rapid growth of datasets may cause some attribute v...
Main Authors: | Chih-Wei Wu, 吳志偉 |
---|---|
Other Authors: | Tzung-Pei Hong |
Format: | Others |
Language: | en_US |
Published: | 2009 |
Online Access: | http://ndltd.ncl.edu.tw/handle/67289209055615848501 |
id |
ndltd-TW-097NUK05442037 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-097NUK05442037 2016-06-22T04:13:45Z http://ndltd.ncl.edu.tw/handle/67289209055615848501 Data Mining with Uncertain Data 不明確資料之資料挖掘 Chih-Wei Wu 吳志偉 Master's thesis National University of Kaohsiung Master's Program, Department of Electrical Engineering 97 Machine learning and data mining are two important kinds of techniques for extracting valuable information from datasets. Although current mining and learning technologies can handle large amounts of data, the rapid growth of datasets may cause some attribute values to go missing during data gathering. Incomplete data are usually handled appropriately to improve the quality of the discovered information. Recovering missing values from a dataset has therefore become an important research issue in data mining and machine learning. In this thesis, we first introduce an iterative missing-value completion method that uses RAR support values to extract useful association rules for inferring missing values. By combining an iterative mechanism with data mining techniques, the proposed method can infer all of the missing attribute values. It consists of three phases. The first phase uses association rules to roughly complete the missing values. The second phase iteratively reduces the minimum support to gather more association rules and complete the remaining missing values. The third phase uses association rules mined from the completed dataset to correct the values that were filled in. The approach is then slightly modified to take partial support values into account when deriving missing values. The second approach performs slightly better than the first because it uses more information (incomplete tuples) when guessing. Experimental results show that both approaches achieve good accuracy and data recovery even when the missing-value rate is high. Tzung-Pei Hong 洪宗貝 2009 學位論文 ; thesis 68 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's thesis === National University of Kaohsiung === Master's Program, Department of Electrical Engineering === 97 === Machine learning and data mining are two important kinds of techniques for extracting valuable information from datasets. Although current mining and learning technologies can handle large amounts of data, the rapid growth of datasets may cause some attribute values to go missing during data gathering. Incomplete data are usually handled appropriately to improve the quality of the discovered information. Recovering missing values from a dataset has therefore become an important research issue in data mining and machine learning. In this thesis, we first introduce an iterative missing-value completion method that uses RAR support values to extract useful association rules for inferring missing values. By combining an iterative mechanism with data mining techniques, the proposed method can infer all of the missing attribute values. It consists of three phases. The first phase uses association rules to roughly complete the missing values. The second phase iteratively reduces the minimum support to gather more association rules and complete the remaining missing values. The third phase uses association rules mined from the completed dataset to correct the values that were filled in. The approach is then slightly modified to take partial support values into account when deriving missing values. The second approach performs slightly better than the first because it uses more information (incomplete tuples) when guessing. Experimental results show that both approaches achieve good accuracy and data recovery even when the missing-value rate is high.
|
author2 |
Tzung-Pei Hong |
author_facet |
Tzung-Pei Hong Chih-Wei Wu 吳志偉 |
author |
Chih-Wei Wu 吳志偉 |
spellingShingle |
Chih-Wei Wu 吳志偉 Data Mining with Uncertain Data |
author_sort |
Chih-Wei Wu |
title |
Data Mining with Uncertain Data |
publishDate |
2009 |
url |
http://ndltd.ncl.edu.tw/handle/67289209055615848501 |
work_keys_str_mv |
AT chihweiwu dataminingwithuncertaindata AT wúzhìwěi dataminingwithuncertaindata AT chihweiwu bùmíngquèzīliàozhīzīliàowājué AT wúzhìwěi bùmíngquèzīliàozhīzīliàowājué |
_version_ |
1718314175081480192 |
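
The three-phase completion procedure summarized in the description can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not define the RAR support measure, so ordinary support and confidence over single-antecedent rules are swapped in, and support is counted over whatever cells are present in each tuple (loosely in the spirit of the partial-support variant). All function names, parameters, and the toy data (`mine_rules`, `predict`, `complete`, `min_support`, `step`, `floor`) are hypothetical.

```python
from collections import Counter
from itertools import permutations


def mine_rules(rows, min_support):
    """Mine rules (attr_a=value_a) -> (attr_b=value_b) from the cells that are present.

    Assumption: plain support/confidence stand in for the thesis's RAR support.
    """
    n = len(rows)
    item_counts = Counter()   # (attr, value) -> number of rows containing it
    pair_counts = Counter()   # ((attr_a, va), (attr_b, vb)) -> co-occurrence count
    for row in rows:
        items = [(a, v) for a, v in row.items() if v is not None]
        item_counts.update(items)
        pair_counts.update(permutations(items, 2))
    candidates = {}
    for (lhs, (attr, value)), count in pair_counts.items():
        if count / n >= min_support:
            confidence = count / item_counts[lhs]
            candidates.setdefault((lhs, attr), []).append((confidence, count, value))
    # Keep the strongest consequent per (antecedent item, target attribute).
    return {k: max(c, key=lambda t: t[:2]) for k, c in candidates.items()}


def predict(row, rules, attr):
    """Return the value suggested for `attr` by the best matching rule, or None."""
    matches = [rules[((a, v), attr)] for a, v in row.items()
               if a != attr and v is not None and ((a, v), attr) in rules]
    return max(matches, key=lambda t: t[:2])[2] if matches else None


def fill_missing(rows, rules, cells):
    """Try to fill each (row index, attribute) cell; return the cells still missing."""
    remaining = []
    for i, attr in cells:
        guess = predict(rows[i], rules, attr)
        if guess is None:
            remaining.append((i, attr))
        else:
            rows[i][attr] = guess
    return remaining


def complete(rows, min_support=0.5, step=0.1, floor=0.1):
    """Three-phase completion; None marks a missing value. The input is not mutated."""
    rows = [dict(r) for r in rows]
    missing = [(i, a) for i, r in enumerate(rows) for a, v in r.items() if v is None]
    # Phase 1: rough completion with rules mined at the initial threshold.
    pending = fill_missing(rows, mine_rules(rows, min_support), missing)
    # Phase 2: lower the minimum support step by step to gather more rules.
    support = min_support
    while pending and support - step >= floor - 1e-9:
        support -= step
        pending = fill_missing(rows, mine_rules(rows, support), pending)
    # Phase 3: re-mine rules from the completed data and revisit the filled cells.
    final_rules = mine_rules(rows, min_support)
    for i, attr in missing:
        if rows[i][attr] is not None:
            guess = predict(rows[i], final_rules, attr)
            if guess is not None:
                rows[i][attr] = guess
    return rows


# Toy data (made up for illustration): two cells are missing.
rows = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "no"},
    {"outlook": "rain", "play": "yes"},
    {"outlook": "rain", "play": "yes"},
    {"outlook": "sunny", "play": None},
    {"outlook": None, "play": "yes"},
]
result = complete(rows)
```

On this toy input no rule reaches the initial 0.5 support, so the first phase fills nothing; the second phase fills both cells once the threshold has dropped to about 0.3, and the third phase confirms them with rules mined from the completed data. Cells that no rule covers above `floor` stay `None`, which is a choice of this sketch rather than something the abstract specifies.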