THE IRRELEVANT VALUES PROBLEM IN THE DECISION TREE FOR MEDICAL EXAMINATIONS
Master's === Tamkang University === Master's Program, Department of Computer Science and Information Engineering === 95 === The decision tree is one of the key data mining techniques and has been applied to medical applications. A decision tree is built by selecting the best test attribute as the root. The same procedure is then applied to each branch to indu...
Main Authors: | Nan-Ching Huang, 黃南競 |
---|---|
Other Authors: | Huan-Chao Keh |
Format: | Others |
Language: | zh-TW |
Published: | 2007 |
Online Access: | http://ndltd.ncl.edu.tw/handle/38909883322526889002 |
id |
ndltd-TW-095TKU05392050 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-095TKU05392050 2015-10-13T14:08:17Z http://ndltd.ncl.edu.tw/handle/38909883322526889002 THE IRRELEVANT VALUES PROBLEM IN THE DECISION TREE FOR MEDICAL EXAMINATIONS 決策樹中移除不相關值問題在醫療研究的運用 Nan-Ching Huang 黃南競 Master's, Tamkang University, Master's Program, Department of Computer Science and Information Engineering, 95 Huan-Chao Keh 葛煥昭 2007 thesis 103 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === Tamkang University === Master's Program, Department of Computer Science and Information Engineering === 95 === The decision tree is one of the key data mining techniques and has been applied to medical applications. A decision tree is built by selecting the best test attribute as the root. The same procedure is then applied to each branch to induce the remaining levels of the tree, until all examples in a leaf belong to the same class. However, because the decision tree creates a branch for each value of the test attribute that appears in the training data, without considering whether that value is relevant to the classification, the resulting tree may suffer from over-specialization. Without loss of generality, we consider only ID3-like algorithms in this paper.
As pointed out by J. Cheng, the irrelevant values problem and the missing branches problem are two causes of over-specialization in decision trees. The missing branches problem arises because some of the reduced subsets at non-leaf nodes do not necessarily contain examples of every possible value of the branching attribute; consequently, the decision tree may fail to classify some instances. The irrelevant values problem arises because some values of the branching attribute may not be relevant to the classification, so the rules derived from the decision tree may contain irrelevant conditions, which demand that extra information be supplied. For a patient, extra information means extra examinations, and extra examinations impose more expense and a greater burden on the patient and on society. When the decision tree is applied to medical applications, we therefore have to deal with irrelevant conditions in order to save medical resources and avoid unnecessary examinations.
When a decision tree is represented as a collection of rules, the antecedents of individual rules may contain irrelevant conditions. When these rules are applied to medical examinations, such conditions may place an unnecessary burden on the patient and on society. To avoid generating rules with irrelevant conditions, we propose a new algorithm that removes them while converting the decision tree to rules, using information available in the decision tree. Our algorithm handles both discrete and continuous values.
|
author2 |
Huan-Chao Keh |
author_facet |
Huan-Chao Keh Nan-Ching Huang 黃南競 |
author |
Nan-Ching Huang 黃南競 |
title |
THE IRRELEVANT VALUES PROBLEM IN THE DECISION TREE FOR MEDICAL EXAMINATIONS |
publishDate |
2007 |
url |
http://ndltd.ncl.edu.tw/handle/38909883322526889002 |
_version_ |
1717749081793626112 |
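
The irrelevant-conditions problem described in the abstract can be illustrated with a small sketch. This is not the algorithm proposed in the thesis, whose details the record does not give. It is a simple stand-in that checks each condition of a rule against the training examples and drops it when the shortened rule still covers only examples of the rule's class. The attribute names (`xray`, `fever`), the class labels, the data, and the function names are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical types: an example or a rule is (attribute -> value, class label).
Example = Tuple[Dict[str, str], str]
Rule = Tuple[Dict[str, str], str]

# Hypothetical training data: two examinations and a diagnosis.
EXAMPLES: List[Example] = [
    ({"fever": "yes", "xray": "normal"}, "sick"),
    ({"fever": "yes", "xray": "abnormal"}, "sick"),
    ({"fever": "no", "xray": "normal"}, "healthy"),
    ({"fever": "no", "xray": "abnormal"}, "sick"),
]


def matches(conditions: Dict[str, str], values: Dict[str, str]) -> bool:
    """True if the example satisfies every condition in the antecedent."""
    return all(values.get(attr) == val for attr, val in conditions.items())


def is_consistent(conditions: Dict[str, str], label: str,
                  examples: List[Example]) -> bool:
    """True if every example covered by the antecedent has the given class."""
    return all(cls == label
               for values, cls in examples
               if matches(conditions, values))


def drop_irrelevant_conditions(rule: Rule, examples: List[Example]) -> Rule:
    """Greedily remove conditions whose removal keeps the rule consistent."""
    conditions, label = dict(rule[0]), rule[1]
    for attribute in list(conditions):
        trial = {a: v for a, v in conditions.items() if a != attribute}
        if is_consistent(trial, label, examples):
            conditions = trial  # the condition was irrelevant for this rule
    return conditions, label


# A rule read off a tree rooted at "xray": the x-ray condition is unnecessary,
# because every example with fever is already classified as "sick".
print(drop_irrelevant_conditions(({"xray": "normal", "fever": "yes"}, "sick"),
                                 EXAMPLES))
# -> ({'fever': 'yes'}, 'sick')
```

In this toy data the shortened rule needs only the fever check, so the x-ray examination can be skipped for that patient. The rule for the "healthy" class keeps both conditions, because dropping either one would cover a "sick" example.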