THE IRRELEVANT VALUES PROBLEM IN THE DECISION TREE FOR MEDICAL EXAMINATIONS

Master's === Tamkang University === Master's Program, Department of Computer Science and Information Engineering === 95 === The decision tree is one of the key data mining techniques and has been applied to medical applications. A decision tree is built by selecting the best test attribute as the root. The same procedure is then applied to each branch to indu...


Bibliographic Details
Main Authors: Nan-Ching Huang, 黃南競
Other Authors: Huan-Chao Keh
Format: Others
Language: zh-TW
Published: 2007
Online Access: http://ndltd.ncl.edu.tw/handle/38909883322526889002
id ndltd-TW-095TKU05392050
record_format oai_dc
spelling ndltd-TW-095TKU05392050 2015-10-13T14:08:17Z http://ndltd.ncl.edu.tw/handle/38909883322526889002 THE IRRELEVANT VALUES PROBLEM IN THE DECISION TREE FOR MEDICAL EXAMINATIONS 決策樹中移除不相關值問題在醫療研究的運用 Nan-Ching Huang 黃南競 Master's, Tamkang University, Master's Program, Department of Computer Science and Information Engineering, 95 Huan-Chao Keh 葛煥昭 2007 thesis 103 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === Tamkang University === Master's Program, Department of Computer Science and Information Engineering === 95 === The decision tree is one of the key data mining techniques and has been applied to medical applications. A decision tree is built by selecting the best test attribute as the root. The same procedure is then applied to each branch to induce the remaining levels of the tree until all examples in a leaf belong to the same class. However, because the decision tree creates a branch for each value of the attribute that appears in the training data, without considering whether the value is relevant to the classification, the resulting tree may suffer from over-specialization. Without loss of generality, we consider only ID3-like algorithms in this paper. As pointed out by J. Cheng, the irrelevant values problem and the missing branches problem are two causes of over-specialization in decision trees. The missing branches problem arises because some of the reduced subsets at non-leaf nodes do not necessarily contain examples of every possible value of the branching attribute; consequently, the decision tree may fail to classify some instances. The irrelevant values problem arises because some values of the branching attribute may not be relevant to the classification, so the resulting rules may contain irrelevant conditions that demand extra information. In a medical setting, extra information means extra examinations for the patient, and extra examinations impose more expense and a greater burden on the patient and on society. When the decision tree is applied to medical applications, we therefore have to deal with irrelevant conditions in order to save medical resources and avoid unnecessary examinations. When a decision tree is represented as a collection of rules, the antecedents of individual rules may contain irrelevant conditions.
Therefore, to avoid generating rules with irrelevant conditions, we propose a new algorithm that removes irrelevant conditions while converting the decision tree to rules, using information in the decision tree. Our algorithm handles both discrete and continuous values.
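As a rough illustration of the two ideas in the abstract, the sketch below shows an ID3-style attribute choice by information gain, followed by a generic consistency check that drops a rule condition when the shortened rule still covers only examples of the same class. This is not the algorithm proposed in the thesis, whose details the abstract does not give. The toy data, the attribute names (fever, cough), and all function names are hypothetical, and only discrete values are handled.

```python
from collections import Counter
from math import log2

# Hypothetical toy training data: (attribute values, class label).
EXAMPLES = [
    ({"fever": "yes", "cough": "yes"}, "flu"),
    ({"fever": "yes", "cough": "no"}, "flu"),
    ({"fever": "no", "cough": "yes"}, "cold"),
    ({"fever": "no", "cough": "no"}, "healthy"),
]


def entropy(examples):
    """Shannon entropy (in bits) of the class labels in `examples`."""
    counts = Counter(label for _, label in examples)
    total = len(examples)
    return -sum(c / total * log2(c / total) for c in counts.values())


def information_gain(examples, attribute):
    """Entropy reduction obtained by splitting `examples` on `attribute`."""
    total = len(examples)
    remainder = 0.0
    for value in {attrs[attribute] for attrs, _ in examples}:
        subset = [ex for ex in examples if ex[0][attribute] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(examples) - remainder


def best_attribute(examples, attributes):
    """ID3-style choice: the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(examples, a))


def matches(rule, attrs):
    """True if `attrs` satisfies every condition in `rule`."""
    return all(attrs.get(a) == v for a, v in rule.items())


def drop_irrelevant_conditions(rule, label, examples):
    """Greedily remove conditions whose removal keeps the rule consistent,
    i.e. every training example the shortened rule covers still has `label`.
    An assumed, generic simplification, not the thesis's method."""
    rule = dict(rule)
    for attribute in list(rule):
        shorter = {a: v for a, v in rule.items() if a != attribute}
        covered = [lbl for attrs, lbl in examples if matches(shorter, attrs)]
        if covered and all(lbl == label for lbl in covered):
            rule = shorter
    return rule


if __name__ == "__main__":
    print(best_attribute(EXAMPLES, ["fever", "cough"]))  # fever
    # A hypothetical over-specialized rule: "cough = yes AND fever = yes -> flu".
    rule = {"cough": "yes", "fever": "yes"}
    print(drop_irrelevant_conditions(rule, "flu", EXAMPLES))  # {'fever': 'yes'}
```

In this toy case the cough condition is dropped because fever alone already determines the class, which in the medical reading of the abstract corresponds to one examination that need not be performed.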
author2 Huan-Chao Keh
author_facet Huan-Chao Keh
Nan-Ching Huang
黃南競
author Nan-Ching Huang
黃南競
title THE IRRELEVANT VALUES PROBLEM IN THE DECISION TREE FOR MEDICAL EXAMINATIONS
publishDate 2007
url http://ndltd.ncl.edu.tw/handle/38909883322526889002