Comparison of predictive modeling approaches for 30-day all-cause non-elective readmission risk

Abstract. Background: This paper explores the importance of electronic medical records (EMR) for predicting 30-day all-cause non-elective readmission risk of patients and presents a comparison of prediction performance of commonly used methods. Methods: The data are extracted from eight Advocate Health...

Bibliographic Details
Main Authors:	Liping Tong, Cole Erdmann, Marina Daldalian, Jing Li, Tina Esposito
Format:	Article
Language:	English
Published:	BMC 2016-02-01
Series:	BMC Medical Research Methodology
Subjects:	Predictive Models; Readmission Risk; STEPWISE; LASSO; Ada Boost
Online Access:	http://link.springer.com/article/10.1186/s12874-016-0128-0

id	doaj-d020c16c03ba4fd3890374823ed18cac
record_format	Article
spelling	doaj-d020c16c03ba4fd3890374823ed18cac | 2020-11-24T23:54:10Z | eng | BMC | BMC Medical Research Methodology | 1471-2288 | 2016-02-01 | 16118 | 10.1186/s12874-016-0128-0 | Comparison of predictive modeling approaches for 30-day all-cause non-elective readmission risk | Liping Tong (Advocate Health Care); Cole Erdmann (Cerner Corporation, World Headquarters); Marina Daldalian (Cerner Corporation, World Headquarters); Jing Li (Cerner Corporation, World Headquarters); Tina Esposito (Advocate Health Care)
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Liping Tong, Cole Erdmann, Marina Daldalian, Jing Li, Tina Esposito
author_sort	Liping Tong
title	Comparison of predictive modeling approaches for 30-day all-cause non-elective readmission risk
publisher	BMC
series	BMC Medical Research Methodology
issn	1471-2288
publishDate	2016-02-01
description	Abstract. Background: This paper explores the importance of electronic medical records (EMR) for predicting the 30-day all-cause non-elective readmission risk of patients and compares the prediction performance of commonly used methods. Methods: The data are extracted from eight Advocate Health Care hospitals. Index admissions are excluded from the cohort if they are observation, inpatient admissions for psychiatry, skilled nursing, hospice, rehabilitation, maternal and newborn visits, or if the patient expires during the index admission. Data are randomly and repeatedly divided into fitting and validating sets for cross-validation. Approaches including LACE, STEPWISE logistic, LASSO logistic, and AdaBoost are compared, with sample sizes varying from 2,500 to 80,000. Results: Our results confirm that LACE has moderate discrimination power, with an area under the receiver operating characteristic curve (AUC) of around 0.65-0.66, which can be improved to 0.73-0.74 when additional variables from the EMR are considered. These variables include Inpatient in the last six months, Number of emergency room visits or inpatients in the last year, Braden score, Polypharmacy, Employment status, Discharge disposition, Albumin level, and medical condition variables such as Leukemia, Malignancy, Renal failure with hemodialysis, History of alcohol substance abuse, Dementia and Trauma. When the sample size is small (≤5,000), LASSO is the best; when the sample size is large (≥20,000), the predictive performance of the methods is similar. The STEPWISE method has a slightly lower AUC (0.734) compared with LASSO (0.737) and AdaBoost (0.737). More than one half of the selected predictors can be false positives when using a single method and a single division of fitting/validating data. Conclusions: True predictors can be identified by repeatedly dividing data into fitting/validating subsets and basing the final model on the summarized results. LASSO is a better alternative to STEPWISE logistic regression, especially when the sample size is not large. The evidence for adequate sample size can be explored by fitting models on gradually reduced samples. Our model comparison strategy is not only good for 30-day all-cause non-elective readmission risk predictions, but also applicable to other types of predictive models in clinical studies.
topic	Predictive Models; Readmission Risk; STEPWISE; LASSO; Ada Boost
url	http://link.springer.com/article/10.1186/s12874-016-0128-0

Similar Items