An Ensemble Approach to Predicting Health Outcomes
Heart disease and premature birth remain the leading causes of mortality and neonatal mortality, respectively, in large parts of the world. They are also estimated to account for the highest medical expenditures in the United States. Early detection of heart disease incidence plays a critical role in preserving he...
Other Authors: | Nilles, Ester Kim |
---|---|
Format: | Others |
Language: | English |
Published: | Florida State University |
Subjects: | Statistics |
Online Access: | http://purl.flvc.org/fsu/fd/FSU_migr_etd-7530 |
id |
ndltd-fsu.edu-oai-fsu.digital.flvc.org-fsu_183846 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-fsu.edu-oai-fsu.digital.flvc.org-fsu_183846 2020-06-16T03:08:46Z An Ensemble Approach to Predicting Health Outcomes Nilles, Ester Kim (author) McGee, Dan (professor directing dissertation) Zhang, Jinfeng (professor co-directing dissertation) Eberstein, Isaac (university representative) Sinha, Debajyoti (committee member) Department of Statistics (degree granting department) Florida State University (degree granting institution) Text Florida State University English eng 1 online resource computer application/pdf Heart disease and premature birth remain the leading causes of mortality and neonatal mortality, respectively, in large parts of the world. They are also estimated to account for the highest medical expenditures in the United States. Early detection of heart disease incidence plays a critical role in preserving heart health, and identifying pregnancies at high risk of premature birth provides highly valuable information for early interventions. Over the past few decades, identification of patients at high health risk has been based on logistic regression or Cox proportional hazards models. In more recent years, machine learning models have grown in popularity within the medical field for their superior predictive and classification performance over the classical statistical models. However, their performance in heart disease and premature birth prediction has been comparable and inconclusive, leaving the question of which model most accurately reflects the data difficult to resolve. Our aim is to incorporate information learned by different models into one final model that will generate superior predictive performance. We first compare the widely used machine learning models (the multilayer perceptron network, k-nearest neighbor, and support vector machine) to the statistical models logistic regression and Cox proportional hazards.
Then the individual models are combined into one in an ensemble approach, also referred to as ensemble modeling. The proposed approaches include SSE-weighted, AUC-weighted, logistic, and flexible naive Bayes ensembles. The individual models are unique and capture different aspects of the data but, as expected, no individual model outperforms the others. The ensemble approach is an easily computed method that eliminates the need to select one model, integrates the strengths of different models, and generates optimal performance. Particularly in cases where the risk factors associated with an outcome are elusive, such as in premature birth, the ensemble models significantly improve prediction. A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. Summer Semester, 2013. June 18, 2013. classification, coronary heart disease, ensemble modeling, machine learning, model selection, preterm birth Includes bibliographical references. Dan McGee, Professor Directing Dissertation; Jinfeng Zhang, Professor Co-Directing Dissertation; Isaac Eberstein, University Representative; Debajyoti Sinha, Committee Member. Statistics FSU_migr_etd-7530 http://purl.flvc.org/fsu/fd/FSU_migr_etd-7530 This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s). The copyright in theses and dissertations completed at Florida State University is held by the students who author them. http://diginole.lib.fsu.edu/islandora/object/fsu%3A183846/datastream/TN/view/Ensemble%20Approach%20to%20Predicting%20Health%20Outcomes.jpg |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
Statistics |
spellingShingle |
Statistics An Ensemble Approach to Predicting Health Outcomes |
description |
Heart disease and premature birth remain the leading causes of mortality and neonatal mortality, respectively, in large parts of the world. They are also estimated to account for the highest medical expenditures in the United States. Early detection of heart disease incidence plays a critical role in preserving heart health, and identifying pregnancies at high risk of premature birth provides highly valuable information for early interventions. Over the past few decades, identification of patients at high health risk has been based on logistic regression or Cox proportional hazards models. In more recent years, machine learning models have grown in popularity within the medical field for their superior predictive and classification performance over the classical statistical models. However, their performance in heart disease and premature birth prediction has been comparable and inconclusive, leaving the question of which model most accurately reflects the data difficult to resolve. Our aim is to incorporate information learned by different models into one final model that will generate superior predictive performance. We first compare the widely used machine learning models (the multilayer perceptron network, k-nearest neighbor, and support vector machine) to the statistical models logistic regression and Cox proportional hazards. Then the individual models are combined into one in an ensemble approach, also referred to as ensemble modeling. The proposed approaches include SSE-weighted, AUC-weighted, logistic, and flexible naive Bayes ensembles. The individual models are unique and capture different aspects of the data but, as expected, no individual model outperforms the others. The ensemble approach is an easily computed method that eliminates the need to select one model, integrates the strengths of different models, and generates optimal performance. Particularly in cases where the risk factors associated with an outcome are elusive, such as in premature birth, the ensemble models significantly improve prediction. === A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. === Summer Semester, 2013. === June 18, 2013. === classification, coronary heart disease, ensemble modeling, machine learning, model selection, preterm birth === Includes bibliographical references. === Dan McGee, Professor Directing Dissertation; Jinfeng Zhang, Professor Co-Directing Dissertation; Isaac Eberstein, University Representative; Debajyoti Sinha, Committee Member. |
author2 |
Nilles, Ester Kim (author) |
author_facet |
Nilles, Ester Kim (author) |
title |
An Ensemble Approach to Predicting Health Outcomes |
title_sort |
ensemble approach to predicting health outcomes |
publisher |
Florida State University |
url |
http://purl.flvc.org/fsu/fd/FSU_migr_etd-7530 |
_version_ |
1719320092988145664 |
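
The abstract lists an AUC-weighted ensemble among the proposed approaches but this record does not give the weighting formula. The following is a minimal sketch of one plausible reading, assuming each model's weight is proportional to its AUC on validation data and that the ensemble prediction is the weighted average of the models' predicted probabilities. The function names (`auc`, `auc_weights`, `ensemble_predict`) and the toy data are hypothetical and are not taken from the dissertation.

```python
from typing import List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_weights(labels: Sequence[int],
                model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Assumed scheme: weights proportional to each model's AUC, summing to 1."""
    aucs = [auc(labels, s) for s in model_scores]
    total = sum(aucs)
    return [a / total for a in aucs]


def ensemble_predict(weights: Sequence[float],
                     model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Weighted average of the models' predicted probabilities, per subject."""
    n = len(model_scores[0])
    return [sum(w * s[i] for w, s in zip(weights, model_scores))
            for i in range(n)]


# Toy validation data, invented purely for illustration.
y_val = [0, 0, 1, 1, 0, 1]
model_a = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]  # stand-in for, e.g., logistic regression
model_b = [0.3, 0.2, 0.6, 0.5, 0.4, 0.9]   # stand-in for, e.g., k-nearest neighbor

weights = auc_weights(y_val, [model_a, model_b])
combined = ensemble_predict(weights, [model_a, model_b])
```

In practice the weights would be estimated on data held out from the data used to fit the individual models, so that the ensemble does not simply reward overfitting.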