Applying Data Mining to Cancer Risk in Patients with Pyogenic Liver Abscess

Bibliographic Details
Main Authors: Tsung-Yu Lee (李宗祐)
Other Authors: Jau-Shin Hon
Format: Others
Language: zh-TW
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/85830557131808502987
Description
Summary: Master's thesis === Tunghai University === Department of Industrial Engineering and Enterprise Information === 103 === Cancer is the leading cause of death in Taiwan. Among the top three cancers, the incidence of colorectal cancer (CRC) and hepatocellular carcinoma (HCC) is much higher in Taiwan than in Western countries. In recent years, a few studies have found that patients with pyogenic liver abscess (PLA) have a higher risk of developing CRC and HCC. This study therefore aims to confirm whether patients with PLA have a higher risk of CRC and HCC, and then to derive rules describing cancer development in these patients to assist in diagnosis. The study used laboratory data from patients with PLA at Taichung Veterans General Hospital. To quantify the cancer risk of patients with PLA, odds ratios were estimated by binary logistic regression. Cancer classification models were then built with the C5.0 decision tree algorithm, and model accuracy was compared between a model that includes the variable PLA and one that excludes it. The more accurate model was taken as the optimal model, and rules were extracted from it to assist in diagnosis. The results showed that the risk of developing CRC and HCC was higher among patients with PLA than among controls [HCC (OR, 3.751; 95% CI, 1.149-12.253), CRC (OR, 6.838; 95% CI, 2.679-17.455)]. In the decision tree analysis, the model that included the variable PLA had higher accuracy and classified cases using fewer factors, which indirectly shows the importance of PLA. Finally, from the optimal model, two rules were identified for CRC and HCC, respectively, to assist in diagnosis.
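The odds ratios above come from binary logistic regression, where an odds ratio is the exponential of a fitted coefficient. The sketch below illustrates that relationship using only the figures reported in the summary. It assumes the intervals are standard Wald intervals, exp(beta ± z·SE), which the record does not state. The function names are hypothetical, and the recovered coefficients and standard errors are back-calculations, not values taken from the thesis.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def odds_ratio_ci(beta, se, z=Z_95):
    """Return (OR, lower, upper) for a logistic-regression coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coef_from_reported(lower, upper, z=Z_95):
    """Recover (beta, se) on the log-odds scale from a Wald interval.

    Assumption: the reported interval is symmetric on the log scale.
    """
    log_lo, log_hi = math.log(lower), math.log(upper)
    beta = (log_lo + log_hi) / 2
    se = (log_hi - log_lo) / (2 * z)
    return beta, se


# Values as reported in the summary: (OR, CI lower, CI upper)
reported = {
    "HCC": (3.751, 1.149, 12.253),
    "CRC": (6.838, 2.679, 17.455),
}

for cancer, (or_reported, lo, hi) in reported.items():
    beta, se = coef_from_reported(lo, hi)
    or_est, lo_est, hi_est = odds_ratio_ci(beta, se)
    print(f"{cancer}: beta={beta:.3f}, SE={se:.3f}, "
          f"OR from interval={or_est:.3f} (reported {or_reported})")
```

If the Wald assumption holds, the odds ratio implied by each interval's log-scale midpoint should agree with the reported point estimate to within rounding. Both lower bounds exceed 1, which is what makes the associations statistically significant at the 5% level.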