Machine learning and deep learning to predict mortality in patients with spontaneous coronary artery dissection

Abstract Machine learning (ML) and deep learning (DL) can successfully predict high prevalence events in very large databases (big data), but the value of this methodology for risk prediction in smaller cohorts with uncommon diseases and infrequent events is uncertain. The clinical course of spontan...

Full description

Bibliographic Details
Main Authors: Chayakrit Krittanawong, Hafeez Ul Hassan Virk, Anirudh Kumar, Mehmet Aydar, Zhen Wang, Matthew P. Stewart, Jonathan L. Halperin
Format: Article
Language:English
Published: Nature Publishing Group 2021-04-01
Series:Scientific Reports
Online Access:https://doi.org/10.1038/s41598-021-88172-0
Description
Summary:Abstract Machine learning (ML) and deep learning (DL) can successfully predict high prevalence events in very large databases (big data), but the value of this methodology for risk prediction in smaller cohorts with uncommon diseases and infrequent events is uncertain. The clinical course of spontaneous coronary artery dissection (SCAD) is variable, and no reliable methods are available to predict mortality. Based on the hypothesis that machine learning (ML) and deep learning (DL) techniques could enhance the identification of patients at risk, we applied a deep neural network to information available in electronic health records (EHR) to predict in-hospital mortality in patients with SCAD. We extracted patient data from the EHR of an extensive urban health system and applied several ML and DL models using candidate clinical variables potentially associated with mortality. We partitioned the data into training and evaluation sets with cross-validation. We estimated model performance based on the area under the receiver-operator characteristics curve (AUC) and balanced accuracy. As sensitivity analyses, we examined results limited to cases with complete clinical information available. We identified 375 SCAD patients of which mortality during the index hospitalization was 11.5%. The best-performing DL algorithm identified in-hospital mortality with AUC 0.98 (95% CI 0.97–0.99), compared to other ML models (P < 0.0001). For prediction of mortality using ML models in patients with SCAD, the AUC ranged from 0.50 with the random forest method (95% CI 0.41–0.58) to 0.95 with the AdaBoost model (95% CI 0.93–0.96), with intermediate performance using logistic regression, decision tree, support vector machine, K-nearest neighbors, and extreme gradient boosting methods. A deep neural network model was associated with higher predictive accuracy and discriminative power than logistic regression or ML models for identification of patients with ACS due to SCAD prone to early mortality.
ISSN:2045-2322