A Hybrid Data Mining Framework with Rough Set Theory, Support Vector Machine, and Decision Tree and its Case Studies
Ph.D. === National Tsing Hua University === Department of Industrial Engineering and Engineering Management === 95 === Support vector machine (SVM), rough set theory (RST), and decision tree (DT) are methodologies applied to various data mining problems, especially classification and prediction tasks. Studies have shown the ability of RST to perform feature selection, while SVM and DT...
Main Authors: | Li-Fei Chen, 陳麗妃 |
---|---|
Other Authors: | Chen-Fu Chien |
Format: | Others |
Language: | en_US |
Published: | 2007 |
Online Access: | http://ndltd.ncl.edu.tw/handle/30869955008789719497 |
id |
ndltd-TW-095NTHU5031009 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-095NTHU5031009 2016-05-25T04:14:03Z http://ndltd.ncl.edu.tw/handle/30869955008789719497 A Hybrid Data Mining Framework with Rough Set Theory, Support Vector Machine, and Decision Tree and its Case Studies 整合約略集合論、支援向量機與決策樹之資料挖礦架構及其個案研究 Li-Fei Chen 陳麗妃 博士 國立清華大學 工業工程與工程管理學系 95 Chen-Fu Chien 簡禎富 2007 學位論文 ; thesis 136 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Ph.D. === National Tsing Hua University === Department of Industrial Engineering and Engineering Management === 95 === Support vector machine (SVM), rough set theory (RST), and decision tree (DT) are methodologies applied to various data mining problems, especially classification and prediction tasks. Studies have shown the ability of RST to perform feature selection, while SVM and DT are noted for their predictive power. This research aims to integrate the advantages of the SVM, RST, and DT approaches into a hybrid framework that enhances the quality of both class prediction and rule generation. In addition to building a classification model with acceptable accuracy, the capability to explain and explore how a decision is made, using simple, understandable, and useful rules, is a critical issue for human resource management. DT and RST can generate such rules; SVM, however, cannot. The framework consists of four main stages. The first stage selects the most important attributes: RST is applied to eliminate redundant and irrelevant attributes without losing any information about the classification. The second stage reduces noisy objects, which is accomplished by cross-validation using SVM. If the resulting data set would introduce a data imbalance problem, the rules generated by RST are used to adjust the class distribution (stage 3). Through these stages, a data set with fewer dimensions, a higher degree of purity, and a similar class distribution is screened out and used to generate rules with DT, which completes the last stage. In addition, decisions concerning personnel selection always involve handling data with high dimensionality, uncertainty, and complexity, which causes traditional statistical methods to suffer from low statistical power. For validation, real cases of personnel selection at two high-tech companies in Hsinchu, Taiwan, covering direct and indirect labor, are studied using the proposed hybrid data mining framework. Implementation results show that the proposed approach is effective and performs better than traditional SVM, RST, and DT.
|
author2 |
Chen-Fu Chien |
author_facet |
Chen-Fu Chien Li-Fei Chen 陳麗妃 |
author |
Li-Fei Chen 陳麗妃 |
spellingShingle |
Li-Fei Chen 陳麗妃 A Hybrid Data Mining Framework with Rough Set Theory, Support Vector Machine, and Decision Tree and its Case Studies |
author_sort |
Li-Fei Chen |
title |
A Hybrid Data Mining Framework with Rough Set Theory, Support Vector Machine, and Decision Tree and its Case Studies |
title_short |
A Hybrid Data Mining Framework with Rough Set Theory, Support Vector Machine, and Decision Tree and its Case Studies |
title_full |
A Hybrid Data Mining Framework with Rough Set Theory, Support Vector Machine, and Decision Tree and its Case Studies |
title_fullStr |
A Hybrid Data Mining Framework with Rough Set Theory, Support Vector Machine, and Decision Tree and its Case Studies |
title_full_unstemmed |
A Hybrid Data Mining Framework with Rough Set Theory, Support Vector Machine, and Decision Tree and its Case Studies |
title_sort |
hybrid data mining framework with rough set theory, support vector machine, and decision tree and its case studies |
publishDate |
2007 |
url |
http://ndltd.ncl.edu.tw/handle/30869955008789719497 |
work_keys_str_mv |
AT lifeichen ahybriddataminingframeworkwithroughsettheorysupportvectormachineanddecisiontreeanditscasestudies AT chénlìfēi ahybriddataminingframeworkwithroughsettheorysupportvectormachineanddecisiontreeanditscasestudies AT lifeichen zhěnghéyuēlüèjíhélùnzhīyuánxiàngliàngjīyǔjuécèshùzhīzīliàowākuàngjiàgòujíqígèànyánjiū AT chénlìfēi zhěnghéyuēlüèjíhélùnzhīyuánxiàngliàngjīyǔjuécèshùzhīzīliàowākuàngjiàgòujíqígèànyánjiū AT lifeichen hybriddataminingframeworkwithroughsettheorysupportvectormachineanddecisiontreeanditscasestudies AT chénlìfēi hybriddataminingframeworkwithroughsettheorysupportvectormachineanddecisiontreeanditscasestudies |
_version_ |
1718279981523533824 |
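
The first stage described in the abstract (using RST to drop redundant and irrelevant attributes without losing classification information) can be illustrated with a small sketch. This is not the thesis's implementation: it assumes a simple greedy elimination that keeps the rough-set positive region unchanged, which is one common way to obtain a reduct. The applicant records, attribute names (`degree`, `experience`, `test`, `region`), and function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into indiscernibility classes over the given attributes."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def positive_region(rows, labels, attrs):
    """Indices of rows whose indiscernibility class carries a single decision label."""
    pos = set()
    for block in partition(rows, attrs):
        if len({labels[i] for i in block}) == 1:
            pos.update(block)
    return pos


def greedy_reduct(rows, labels, attrs):
    """Drop attributes one at a time while the positive region stays unchanged.

    Assumption: a single greedy pass in the given attribute order. The result is
    a reduct of the table, but not necessarily the smallest one.
    """
    target = positive_region(rows, labels, attrs)
    kept = list(attrs)
    for a in list(attrs):
        trial = [b for b in kept if b != a]
        if positive_region(rows, labels, trial) == target:
            kept = trial  # attribute a is redundant given the remaining ones
    return kept


# Hypothetical personnel-selection records (illustrative only).
attrs = ["degree", "experience", "test", "region"]
rows = [
    {"degree": "BS", "experience": "low", "test": "pass", "region": "north"},
    {"degree": "BS", "experience": "high", "test": "pass", "region": "south"},
    {"degree": "MS", "experience": "high", "test": "fail", "region": "north"},
    {"degree": "MS", "experience": "low", "test": "fail", "region": "south"},
    {"degree": "BS", "experience": "high", "test": "pass", "region": "north"},
    {"degree": "MS", "experience": "high", "test": "pass", "region": "south"},
    {"degree": "MS", "experience": "low", "test": "pass", "region": "north"},
]
labels = ["reject", "hire", "reject", "reject", "hire", "hire", "reject"]

reduct = greedy_reduct(rows, labels, attrs)
print(reduct)  # ['experience', 'test']
```

In this toy table, `degree` and `region` can be removed because every group of records that agree on `experience` and `test` still shares one decision label. The later stages (SVM cross-validation to filter noisy objects, rebalancing, and DT rule generation) would then operate on the reduced attribute set.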