Summary: | The value of machine learning in healthcare comes from its ability to process large amounts of healthcare data and extract clinical insights that help physicians plan and provide care with better outcomes and lower costs. Recent studies exploring machine learning techniques suggest that predictive models have the potential to identify high-risk patients; however, the advantage of machine learning methods over classical methods is neither evident nor
universal. Moreover, only a few studies address the challenges posed by the class-imbalanced data commonly encountered in healthcare applications. In this work, we compared different machine learning algorithms for predicting all-cause readmission within 30 days of discharge after a heart failure hospitalization. We addressed both feature selection and class imbalance in the healthcare data, developed several machine learning models, and studied their performance. The
models explored include logistic regression, decision tree, random forest, Naïve Bayes, support vector machine, and XGBoost. We compared their performance using metrics such as the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. We identified 5894 patients admitted with heart failure complications between 2011 and 2015. The dataset included 8684 records and 61 variables. Among the study patients, 16.44% were readmitted within 30
days of hospital discharge. This research explored the effectiveness of different class balancing and feature selection approaches. The models produced AUCs in the range of 0.62 to 0.79 and sensitivities in the range of 0.25 to 0.73. On the current dataset, machine learning techniques did not outperform the standard regression model in predicting 30-day readmission for heart failure patients. However, the results achieved by all the classifiers agree with the results reported in the
literature.
|
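The three metrics named in the summary (sensitivity, specificity, and AUC) can be illustrated with a minimal sketch in plain Python. The labels, scores, and the 0.5 decision threshold below are invented toy values chosen for illustration and are not the study's data; the function names are likewise hypothetical and do not come from the original work.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = readmitted."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Sensitivity: share of actual readmissions that were flagged.
    # Specificity: share of non-readmissions that were correctly left unflagged.
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, imbalanced example (assumed values): 2 readmitted, 4 not readmitted.
y_true = [1, 1, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.1]           # hypothetical model risk scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]   # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec, auc(y_true, scores))
```

On imbalanced data such as this, a model can reach high specificity while missing many readmissions, which is one reason sensitivity is worth reporting alongside AUC.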