Data Mining with Uncertain Data

Master's === National University of Kaohsiung === Department of Electrical Engineering, Master's Program === 97 === Machine learning and data mining are two important kinds of techniques for extracting valuable information from datasets. Although current mining and learning technologies can handle large amounts of data, the rapid growth of datasets may cause some attribute v...

Bibliographic Details
Main Authors:	Chih-Wei Wu, 吳志偉
Other Authors:	Tzung-Pei Hong
Format:	Others
Language:	en_US
Published:	2009
Online Access:	http://ndltd.ncl.edu.tw/handle/67289209055615848501

id	ndltd-TW-097NUK05442037
record_format	oai_dc
spelling	ndltd-TW-097NUK05442037 2016-06-22T04:13:45Z http://ndltd.ncl.edu.tw/handle/67289209055615848501 Data Mining with Uncertain Data 不明確資料之資料挖掘 Chih-Wei Wu 吳志偉 Master's, National University of Kaohsiung, Department of Electrical Engineering, Master's Program 97 Tzung-Pei Hong 洪宗貝 2009 thesis 68 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Master's === National University of Kaohsiung === Department of Electrical Engineering, Master's Program === 97 === Machine learning and data mining are two important kinds of techniques for extracting valuable information from datasets. Although current mining and learning technologies can handle large amounts of data, the rapid growth of datasets may cause some attribute values to be missed in the data-gathering process. Incomplete data usually must be handled appropriately to improve the quality of the discovered information. Therefore, recovering missing values from a dataset has become an important research issue in data mining and machine learning. In this thesis, we first introduce an iterative missing-value completion method based on RAR support values, which extracts useful association rules and uses them to infer missing values iteratively. By combining an iterative mechanism with data mining techniques, the proposed method can infer all of the missing attribute values. It consists of three phases. The first phase uses association rules to roughly complete the missing values. The second phase iteratively reduces the minimum support to gather more association rules and complete the remaining missing values. The third phase uses association rules mined from the completed dataset to correct the values that were filled in. The proposed approach is then slightly modified to consider partial support values when deriving missing values. The second approach performs slightly better than the first because it uses more information (incomplete tuples) when guessing. Experimental results show that both proposed approaches achieve good accuracy and data recovery even when the missing-value rate is high.
author2	Tzung-Pei Hong
author_facet	Tzung-Pei Hong Chih-Wei Wu 吳志偉
author	Chih-Wei Wu 吳志偉
spellingShingle	Chih-Wei Wu 吳志偉 Data Mining with Uncertain Data
author_sort	Chih-Wei Wu
title	Data Mining with Uncertain Data
publishDate	2009
url	http://ndltd.ncl.edu.tw/handle/67289209055615848501
work_keys_str_mv	AT chihweiwu dataminingwithuncertaindata AT wúzhìwěi dataminingwithuncertaindata AT chihweiwu bùmíngquèzīliàozhīzīliàowājué AT wúzhìwěi bùmíngquèzīliàozhīzīliàowājué
_version_	1718314175081480192
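
The three phases named in the description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: it uses ordinary support and confidence over fully known tuples rather than the RAR or partial support values the abstract mentions, and the names `mine_rules`, `predict`, `complete`, the `supports` schedule, and the `min_conf` threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations

MISSING = None  # assumed marker for an unknown attribute value


def mine_rules(rows, target, min_support, min_conf, max_len=2):
    """Mine rules 'antecedent -> value of column `target`' from fully known rows."""
    full = [r for r in rows if MISSING not in r]
    if not full:
        return []
    ante_count, joint_count = Counter(), Counter()
    for row in full:
        items = [(a, v) for a, v in enumerate(row) if a != target]
        for k in range(1, max_len + 1):
            for ante in combinations(items, k):
                ante_count[ante] += 1
                joint_count[ante, row[target]] += 1
    rules = []
    for (ante, value), count in joint_count.items():
        support = count / len(full)
        confidence = count / ante_count[ante]
        if support >= min_support and confidence >= min_conf:
            rules.append((confidence, support, ante, value))
    # Strongest rules first; the sort is stable, so ties keep discovery order.
    rules.sort(key=lambda r: r[:2], reverse=True)
    return rules


def predict(row, rules):
    """Return the consequent of the first (strongest) rule matching `row`."""
    for _conf, _sup, ante, value in rules:
        if all(row[a] == v for a, v in ante):
            return value
    return MISSING


def complete(rows, supports=(0.5, 0.3, 0.1), min_conf=0.6):
    """Fill missing cells in three phases; returns a new list of lists."""
    data = [list(r) for r in rows]
    holes = [(i, j) for i, r in enumerate(data)
             for j, v in enumerate(r) if v is MISSING]
    targets = {j for _, j in holes}
    # Phase 1 uses supports[0]; phase 2 repeats with each lower threshold.
    for s in supports:
        if all(data[i][j] is not MISSING for i, j in holes):
            break
        rules = {j: mine_rules(data, j, s, min_conf) for j in targets}
        for i, j in holes:
            if data[i][j] is MISSING:
                data[i][j] = predict(data[i], rules[j])
    # Phase 3: re-mine on the completed data and revise earlier guesses.
    rules = {j: mine_rules(data, j, supports[0], min_conf) for j in targets}
    for i, j in holes:
        guess = predict(data[i], rules[j])
        if guess is not MISSING:
            data[i][j] = guess
    return data
```

Calling `complete` on a small table of categorical tuples with `None` in the unknown cells returns a copy in which each cell covered by some rule has been filled. A cell that no rule covers, even at the lowest threshold, stays `None` in this sketch.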

Similar Items