A Hybrid Data Mining Framework with Rough Set Theory, Support Vector Machine, and Decision Tree and its Case Studies

Ph.D. === National Tsing Hua University === Department of Industrial Engineering and Engineering Management === 95 === Support vector machine (SVM), rough set theory (RST), and decision tree (DT) are methodologies applied to various data mining problems, especially classification and prediction tasks. Studies have shown the ability of RST to perform feature selection, while SVM and DT...

Bibliographic Details
Main Authors:	Li-Fei Chen, 陳麗妃
Other Authors:	Chen-Fu Chien
Format:	Others
Language:	en_US
Published:	2007
Online Access:	http://ndltd.ncl.edu.tw/handle/30869955008789719497

id	ndltd-TW-095NTHU5031009
record_format	oai_dc
spelling	ndltd-TW-095NTHU5031009 2016-05-25T04:14:03Z http://ndltd.ncl.edu.tw/handle/30869955008789719497 A Hybrid Data Mining Framework with Rough Set Theory, Support Vector Machine, and Decision Tree and its Case Studies 整合約略集合論、支援向量機與決策樹之資料挖礦架構及其個案研究 Li-Fei Chen 陳麗妃 Ph.D., National Tsing Hua University, Department of Industrial Engineering and Engineering Management 95 Chen-Fu Chien 簡禎富 2007 thesis 136 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Ph.D. === National Tsing Hua University === Department of Industrial Engineering and Engineering Management === 95 === Support vector machine (SVM), rough set theory (RST), and decision tree (DT) are methodologies applied to various data mining problems, especially classification and prediction tasks. Studies have shown the ability of RST to perform feature selection, while SVM and DT are noted for their predictive power. This research aims to integrate the advantages of the SVM, RST, and DT approaches into a hybrid framework that enhances the quality of both class prediction and rule generation. In addition to building a classification model with acceptable accuracy, the ability to explain how a decision is made using simple, understandable, and useful rules is a critical issue for human resource management. DT and RST can generate such rules; SVM cannot. The framework consists of four main stages. The first stage selects the most important attributes: RST is applied to eliminate redundant and irrelevant attributes without losing any information about the classification. The second stage reduces noisy objects, which is accomplished by cross-validation using SVM. If the resulting data set induces a data imbalance problem, the rules generated by RST are used to adjust the class distribution (stage 3). Through these stages, a data set with fewer dimensions and a higher degree of purity is screened out with a similar class distribution; in the last stage, it is used to generate rules with DT. In addition, personnel selection decisions always involve data with high dimensionality, uncertainty, and complexity, which causes traditional statistical methods to suffer from low test power. For validation, real cases of personnel selection at two high-tech companies in Hsinchu, Taiwan, covering direct and indirect labor, are studied using the proposed hybrid data mining framework. Implementation results show that the proposed approach is effective and performs better than traditional SVM, RST, and DT.
author2	Chen-Fu Chien
author_facet	Chen-Fu Chien Li-Fei Chen 陳麗妃
author	Li-Fei Chen 陳麗妃
author_sort	Li-Fei Chen
title	A Hybrid Data Mining Framework with Rough Set Theory, Support Vector Machine, and Decision Tree and its Case Studies
publishDate	2007
url	http://ndltd.ncl.edu.tw/handle/30869955008789719497

Similar Items