Medical Diagnosis Knowledge Extraction using Feature Selection: An Example of Coronary Heart Disease Diagnosis Knowledge Extraction

Bibliographic Details
Main Authors: Yu-Hsien Wang, 王渝嫻
Other Authors: Tsang-Hsiang Cheng
Format: Others
Language: zh-TW
Published: 2007
Online Access: http://ndltd.ncl.edu.tw/handle/22322279001066230111
Description
Summary: Master's === 南台科技大學 === Department of Business Administration === 95 === Advances in computer technology have led researchers to build expert systems that help people solve problems the way an expert would. However, it is difficult to unify and organize experts' knowledge so that it can be stored in an information system. In the current medical environment, physicians are asked to raise the quality of medical care while reducing its cost. To help physicians diagnose efficiently and accurately from patients' histories and laboratory examinations, this study evaluates feature selection methods, both expert-driven and automatic, for the diagnosis of coronary artery heart disease. The evaluation results may help in extracting medical knowledge from historical cases. We use a coronary artery heart disease dataset from the UCI repository, which contains 920 patient records, and we invited three cardiac physicians to take part in identifying the important diagnostic factors for the disease. We first used a Delphi-like method to identify the important diagnostic factors with the help of the three physicians. We then used an automatic feature selection mechanism, CFS (correlation-based feature selection), to select important factors. Finally, we built C4.5 decision-tree models on the resulting feature sets. Based on the performance of the C4.5 models, we find that the automatic feature selection method can be applied first to quickly select important features, after which the feature set is revised according to the experts' opinions. C4.5 models built on the revised feature set achieve good disease classification performance. At the same time, the readability and explainability of the prognosis models also improve. This approach also simplifies the work of extracting diagnostic knowledge.
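
Illustrative note: the automatic step named in the summary, CFS, scores a feature subset by how strongly its features relate to the class and how little they relate to each other, commonly written as merit = k * r_cf / sqrt(k + k*(k-1) * r_ff). The sketch below is a minimal illustration under stated assumptions and is not the thesis's implementation: it uses absolute Pearson correlation as the association measure (CFS implementations often use symmetrical uncertainty instead), a plain greedy forward search, and a tiny made-up dataset whose column names (`a`, `a_copy`, `noise`) are hypothetical and unrelated to the UCI records.

```python
from math import sqrt


def pearson(x, y):
    """Absolute Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation signal
    return abs(cov / sqrt(vx * vy))


def cfs_merit(columns, target, subset):
    """Merit of a subset: k * r_cf / sqrt(k + k*(k-1) * r_ff)."""
    k = len(subset)
    # Mean feature-class correlation.
    r_cf = sum(pearson(columns[f], target) for f in subset) / k
    if k == 1:
        return r_cf
    # Mean feature-feature correlation over all unordered pairs.
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(pearson(columns[a], columns[b]) for a, b in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def cfs_forward(columns, target):
    """Greedy forward search: add the best feature until merit stops improving."""
    selected, best = [], 0.0
    remaining = sorted(columns)
    while remaining:
        scored = [(cfs_merit(columns, target, selected + [f]), f) for f in remaining]
        merit, feature = max(scored)
        if merit <= best:
            break
        selected.append(feature)
        remaining.remove(feature)
        best = merit
    return selected, best


# Toy data (assumption): one informative column, an exact duplicate, and noise.
target = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "a": [1, 2, 1, 2, 5, 6, 5, 6],
    "a_copy": [1, 2, 1, 2, 5, 6, 5, 6],
    "noise": [1, 2, 1, 2, 1, 2, 1, 2],
}
selected, merit = cfs_forward(columns, target)
print(selected, round(merit, 3))
```

On this toy data the search keeps a single informative column and rejects both its duplicate (redundant, so merit does not rise) and the uninformative column. A subset chosen this way could then be reviewed by domain experts before a decision tree is trained on it, which is the order of steps the summary recommends.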