Apply the Health Examination Data to Construct Colorectal Cancer Prediction Models

Bibliographic Details
Main Authors: Yi-Siang Lin, 林義祥
Other Authors: Chun-Yuan Cheng
Format: Others
Language: zh-TW
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/71261931830059249448
Description
Summary: Master's thesis === Chaoyang University of Technology (朝陽科技大學) === Master's Program, Department of Industrial Engineering and Management === 99 === In recent years, as society has changed rapidly, people have turned to high-fat, low-fiber foods for the sake of convenience. Excessive intake of such food, however, irritates the colonic mucosa, stimulates the digestive system, and tends to obstruct the large intestine. According to DOH statistics, the number of people with colon cancer (also known as colorectal cancer) has risen over the years, from 2,462 in 1996 to 4,531 in 2009, so the impact of colorectal cancer cannot be neglected. This study applies data mining techniques to general health examination data from a medical center. The goal is to explore the correlations between physical examination data and colorectal cancer-related disease, and to build a colorectal cancer prediction model. The model is constructed in two stages: (1) discriminant analysis and logistic regression are used to select the important risk factors for colorectal cancer from the health examination data; (2) the risk factors selected in stage one serve as independent variables for artificial neural networks and support vector machines, which are used to construct the prediction model. The results show that the model combining logistic regression with a support vector machine is the more accurate one, with a mean accuracy of 88.60%, a sensitivity of 87.32%, and a specificity of 75.76%. The model combining discriminant analysis with the best support vector machine, however, is less affected by the ratio of normal to abnormal records in the sample, with a mean accuracy of 77.45%, a sensitivity of 76.53%, and a specificity of 76.50%. Keywords: Physical examination, Discriminant Analysis, Logistic Regression, Artificial Neural Networks, Support Vector Machines, Colorectal cancer
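
The summary reports accuracy, sensitivity, and specificity and notes that some models are more affected than others by the ratio of normal to abnormal records. The following is a minimal sketch of how these three measures are commonly computed from binary predictions. It is not the thesis's code: the function name `classification_metrics`, the label coding (1 = abnormal, 0 = normal), and the toy labels are all assumptions made for illustration, and the numbers are not taken from the study.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity for binary labels (1 = abnormal, 0 = normal)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / total if total else nan,
        # Sensitivity: share of abnormal records that were flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Specificity: share of normal records that were cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Hypothetical toy sample (NOT data from the thesis): 10 abnormal, 90 normal.
y_true = [1] * 10 + [0] * 90

# A hypothetical classifier that catches 8 of 10 abnormal records
# and clears 81 of 90 normal ones.
y_pred = [1] * 8 + [0] * 2 + [0] * 81 + [1] * 9
metrics = classification_metrics(y_true, y_pred)

# A degenerate classifier that labels every record as normal.
always_normal = classification_metrics(y_true, [0] * 100)

print(metrics)
print(always_normal)
```

On this imbalanced toy sample, the degenerate classifier scores a slightly higher accuracy than the useful one while its sensitivity is zero, which is one reason to read sensitivity and specificity alongside accuracy when the normal-to-abnormal ratio is skewed.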