Integrating Data Mining and Kernel Density Estimation to Assess Common Physiological Indicators of Multiple Diseases

Doctoral === Yuan Ze University === Department of Industrial Engineering and Management === 100 === With changes in lifestyle, certain chronic diseases have become major causes of death; however, their initial symptoms are usually not obvious, and these diseases can induce one another. Their prevention and treatment are therefore difficult...

Bibliographic Details
Main Authors: Cheng-Ting Chang, 張正鼎
Other Authors: Bernard C. Jiang
Format: Others
Language: zh-TW
Online Access: http://ndltd.ncl.edu.tw/handle/78029885575985648417
id ndltd-TW-100YZU05031032
record_format oai_dc
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Doctoral === Yuan Ze University === Department of Industrial Engineering and Management === 100 === With changes in lifestyle, certain chronic diseases have become major causes of death; however, their initial symptoms are usually not obvious, and these diseases can induce one another. Their prevention and treatment are therefore difficult. Many previous studies have built predictive models for a single disease. However, these studies overlook that associated diseases may have reciprocal effects and that abnormalities in physiological indicators can signal several associated diseases at once. In addition, the concept of risk of failure is commonly used, and applying it to assess the physiological systems of the human body is an interesting issue. In this study, we developed an analysis process that selects the common physiological indicators of multiple diseases and constructs a predictive model for multiple physiological conditions. Moreover, the values of the common physiological indicators were varied across different physiological states in order to construct a disease risk model for physiological systems. In the first part of the analysis process, various data mining technologies were combined in a multiple classifier system to extract the common physiological indicators of multiple diseases by majority voting. The second part focused on constructing predictive models for multiple diseases, using the common physiological indicators as predictors. Kernel density estimation was applied to fit the distribution of each common physiological indicator and to estimate the probability that an indicator value belongs to the healthy condition. The reliability of the physiological system was then defined as the product, over all common physiological indicators, of the probabilities of belonging to the healthy condition. Three cases were used to illustrate the analysis process.
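The majority-voting step described above can be sketched as follows. This is only a minimal illustration, assuming each selection method returns a list of indicator names and that an indicator is kept when more than half of the methods select it; the function name `majority_vote`, the placeholder method names, and the per-method selections are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def majority_vote(selections, min_votes=None):
    """Return the indicators chosen by at least `min_votes` methods.

    `selections` maps a method name to the indicators that method selected.
    By default an indicator needs a strict majority of the methods.
    """
    n_methods = len(selections)
    if min_votes is None:
        min_votes = n_methods // 2 + 1  # strict majority
    # Count each indicator at most once per method.
    votes = Counter(ind for chosen in selections.values() for ind in set(chosen))
    return sorted(ind for ind, count in votes.items() if count >= min_votes)


# Hypothetical selections from six methods (illustrative only, not thesis results).
selections = {
    "logistic_regression": ["thal", "ca", "cp", "exang"],
    "decision_tree": ["thal", "ca", "cp", "oldpeak"],
    "discriminant_analysis": ["thal", "ca", "exang", "age"],
    "method_4": ["thal", "cp", "exang", "chol"],
    "method_5": ["ca", "cp", "exang", "thal"],
    "method_6": ["thal", "ca", "trestbps"],
}

common_indicators = majority_vote(selections)
print(common_indicators)
```

With six methods the default threshold is four votes; passing a larger `min_votes` gives a stricter selection.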
In each case, six data mining technologies, including logistic regression, decision trees, and discriminant analysis, were first combined to select the common physiological indicators of multiple diseases; the selected indicators were then used in multivariate adaptive regression splines (MARS) and artificial neural networks (ANN) to construct a predictive model for multiple diseases. In the UCI heart disease dataset, thalassemia (thal), number of major vessels colored by fluoroscopy (ca), chest pain type (cp), and exercise-induced angina (exang) were the common physiological indicators of heart diseases. The highest predictive accuracy achieved with these indicators by a multilayer perceptron neural network (MLPNN) was 67.16%. The second dataset comes from a survey on the prevalence of high blood sugar, hyperlipidemia, and hypertension in Taiwan, which received a grant from the Bureau of Health Promotion, Department of Health, Taiwan. The common physiological indicators of these three diseases were fasting plasma glucose (FPG), total cholesterol (T-CHO), triglyceride (TG), systolic blood pressure (SBP), and diastolic blood pressure (DBP). These indicators are consistent with the clinical guidelines, and an MLPNN predictive model built on them achieved an accuracy of 98.91%. This study simulated the result of Stewart et al. (2005) and estimated that the probability of suffering from hypertension, hyperlipidemia, or high blood sugar would fall to 7.29% after a 6-week exercise program of 3 days a week. The third dataset was obtained from a health examination center in a hospital. According to the analysis process, the common physiological indicators of hypertension and hyperlipidemia were sex, total cholesterol, SBP, and DBP. Whether MARS or ANN was used to construct the predictive models, all models achieved an accuracy of over 92%.
Moreover, the reliability and maintainability of physiological systems can be quantified by using kernel density estimation to analyze the distribution of each of the aforementioned four common physiological indicators under any physiological condition. This study simulated the finding of Lewis et al. (1976) and estimated that the probability of suffering from hypertension or hyperlipidemia would fall to 4.63% after a 17-week exercise period. The analysis process proposed in this study enables the selection of the common physiological indicators of multiple diseases and the construction of a predictive model for these diseases. It also allows the reliability of physiological systems and the effect of maintenance activities on human health to be quantified.
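The reliability calculation can be illustrated with a small sketch. The abstract does not state how the probability of belonging to the healthy condition is derived from the kernel density estimates, so this example assumes one plausible reading: a Gaussian-kernel density is fitted separately to healthy and diseased samples of each indicator, the two densities are combined with class proportions as priors, and reliability is the product of the resulting per-indicator probabilities. The bandwidth rule (h = 1.06 · sd · n^(-1/5)), all function names, and all sample values are assumptions for illustration, not the thesis's data or implementation.

```python
import math
import statistics


def rule_of_thumb_bandwidth(samples):
    # Normal-reference rule of thumb: h = 1.06 * sd * n^(-1/5) (an assumed choice).
    return 1.06 * statistics.stdev(samples) * len(samples) ** (-0.2)


def gaussian_kde(samples):
    """Return a function x -> Gaussian kernel density estimate at x."""
    h = rule_of_thumb_bandwidth(samples)
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)

    return density


def p_healthy(x, healthy, diseased):
    """Probability that value x belongs to the healthy group (assumed Bayes-style rule)."""
    prior_h = len(healthy) / (len(healthy) + len(diseased))
    num = prior_h * gaussian_kde(healthy)(x)
    den = num + (1.0 - prior_h) * gaussian_kde(diseased)(x)
    return num / den if den > 0 else 0.5  # no evidence either way


def reliability(person, data):
    """Product of the per-indicator probabilities of being healthy."""
    return math.prod(p_healthy(person[name], *data[name]) for name in data)


# Hypothetical (healthy, diseased) samples per indicator; not thesis data.
data = {
    "SBP": ([110, 115, 118, 120, 122, 125], [140, 145, 150, 155, 160, 165]),
    "T-CHO": ([160, 170, 180, 185, 190, 195], [220, 230, 240, 250, 260, 270]),
}

r_before = reliability({"SBP": 150, "T-CHO": 240}, data)
r_after = reliability({"SBP": 120, "T-CHO": 180}, data)
print(r_before, r_after)
```

Under these assumptions, a maintenance activity such as exercise is modeled simply as a shift in the indicator values, and its effect is read off as the change in reliability between the two calls.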
spelling ndltd-TW-100YZU05031032 2015-10-13T21:33:09Z http://ndltd.ncl.edu.tw/handle/78029885575985648417 Integrating Data Mining and Kernel Density Estimation to Assess Common Physiological Indicators of Multiple Diseases 整合資料探勘與核密度估計技術於人體多重疾病共同生理指標之評估 Cheng-Ting Chang 張正鼎 Doctoral, Yuan Ze University, Department of Industrial Engineering and Management, 100 Bernard C. Jiang 江行全 Degree thesis ; thesis 127 zh-TW