Development of An Individualized Risk Prediction Model for COVID-19 Using Electronic Health Record Data

Developing an accurate and interpretable model to predict an individual’s risk for Coronavirus Disease 2019 (COVID-19) is a critical step to efficiently triage testing and other scarce preventative resources. To aid in this effort, we have developed an interpretable risk calculator that utilized de-...

Bibliographic Details
Main Authors: Tarun Karthik Kumar Mamidi, Thi K. Tran-Nguyen, Ryan L. Melvin, Elizabeth A. Worthey
Format: Article
Language: English
Published: Frontiers Media S.A. 2021-06-01
Series: Frontiers in Big Data
Subjects: COVID-19; electronic health record; risk prediction; ICD-10; credit scorecard model
Online Access: https://www.frontiersin.org/articles/10.3389/fdata.2021.675882/full
id doaj-d939237654584a02ba0dcf3529dde537
record_format Article
spelling doaj-d939237654584a02ba0dcf3529dde537 2021-06-04T04:57:03Z eng Frontiers Media S.A. Frontiers in Big Data 2624-909X 2021-06-01 4 10.3389/fdata.2021.675882 675882
Development of An Individualized Risk Prediction Model for COVID-19 Using Electronic Health Record Data
Tarun Karthik Kumar Mamidi (0); Thi K. Tran-Nguyen (1); Ryan L. Melvin (2); Elizabeth A. Worthey (3, 4)
(0) Center for Computational Genomics and Data Science, Departments of Pediatrics and Pathology, University of Alabama at Birmingham School of Medicine, Birmingham, AL, United States
(1) Hugh Kaul Precision Medicine Institute, University of Alabama at Birmingham, Birmingham, AL, United States
(2) Department of Anesthesiology and Perioperative Medicine, University of Alabama at Birmingham, Birmingham, AL, United States
(3) Center for Computational Genomics and Data Science, Departments of Pediatrics and Pathology, University of Alabama at Birmingham School of Medicine, Birmingham, AL, United States
(4) Hugh Kaul Precision Medicine Institute, University of Alabama at Birmingham, Birmingham, AL, United States
https://www.frontiersin.org/articles/10.3389/fdata.2021.675882/full
COVID-19; electronic health record; risk prediction; ICD-10; credit scorecard model
collection DOAJ
language English
format Article
sources DOAJ
author Tarun Karthik Kumar Mamidi
Thi K. Tran-Nguyen
Ryan L. Melvin
Elizabeth A. Worthey
title Development of An Individualized Risk Prediction Model for COVID-19 Using Electronic Health Record Data
publisher Frontiers Media S.A.
series Frontiers in Big Data
issn 2624-909X
publishDate 2021-06-01
description Developing an accurate and interpretable model to predict an individual’s risk for Coronavirus Disease 2019 (COVID-19) is a critical step toward efficiently triaging testing and other scarce preventive resources. To aid in this effort, we have developed an interpretable risk calculator that uses de-identified electronic health records (EHR) from the University of Alabama at Birmingham Informatics for Integrating Biology and the Bedside (UAB-i2b2) COVID-19 repository under the U-BRITE framework. The generated risk scores are analogous to commonly used credit scores, where higher scores indicate higher risk of COVID-19 infection. By design, these risk scores can easily be calculated in spreadsheets or even with pen and paper. To predict risk, we implemented a Credit Scorecard modeling approach on longitudinal EHR data from 7,262 patients enrolled in the UAB Health System who were evaluated and/or tested for COVID-19 between January and June 2020. In this cohort, 912 patients were positive for COVID-19. Our workflow considered the timing of symptoms and medical conditions and tested the effects by applying different variable selection techniques such as LASSO and Elastic-Net. Within the two weeks before a COVID-19 diagnosis, the most predictive features were respiratory symptoms such as cough, abnormalities of breathing, and pain in the throat and chest, as well as chronic conditions including nicotine dependence and major depressive disorder. When the timeframe was extended to include all medical conditions across all time, our models also uncovered several chronic conditions affecting the respiratory, cardiovascular, central nervous, and urinary organ systems. The whole pipeline of data processing, risk modeling, and web-based risk calculator can be applied to any EHR data following the OMOP common data format. The results can be used to generate questionnaires that estimate COVID-19 risk for screening at building entrances or to optimize hospital resources.
topic COVID-19
electronic health record
risk prediction
ICD-10
credit scorecard model
url https://www.frontiersin.org/articles/10.3389/fdata.2021.675882/full
work_keys_str_mv AT tarunkarthikkumarmamidi developmentofanindividualizedriskpredictionmodelforcovid19usingelectronichealthrecorddata
AT thiktrannguyen developmentofanindividualizedriskpredictionmodelforcovid19usingelectronichealthrecorddata
AT ryanlmelvin developmentofanindividualizedriskpredictionmodelforcovid19usingelectronichealthrecorddata
AT elizabethaworthey developmentofanindividualizedriskpredictionmodelforcovid19usingelectronichealthrecorddata
AT elizabethaworthey developmentofanindividualizedriskpredictionmodelforcovid19usingelectronichealthrecorddata
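
The description above says the risk scores resemble credit scores and can be computed with pen and paper. The Python sketch below illustrates one common way a credit scorecard turns logistic-regression log-odds into additive integer points, using the conventional "points to double the odds" (PDO) scaling. It is an assumption-laden illustration, not the authors' implementation: the scaling constants, the intercept, the weights, and the feature names (symptom_a, symptom_b, condition_c) are all hypothetical placeholders and do not come from the article.

```python
import math

# Hypothetical scaling constants (not taken from the article).
BASE_SCORE = 500   # score assigned at the reference odds
BASE_ODDS = 1.0    # reference odds of infection (1:1)
PDO = 20           # points needed to double the odds

FACTOR = PDO / math.log(2)
OFFSET = BASE_SCORE - FACTOR * math.log(BASE_ODDS)

# Hypothetical logistic-regression intercept and weights for binary
# (present/absent) features; placeholders only, not published results.
INTERCEPT = -2.0
WEIGHTS = {"symptom_a": 1.2, "symptom_b": 0.7, "condition_c": -0.4}


def build_scorecard(intercept, weights):
    """Convert log-odds terms into integer points (base plus per-feature)."""
    base_points = round(OFFSET + FACTOR * intercept)
    feature_points = {name: round(FACTOR * w) for name, w in weights.items()}
    return base_points, feature_points


def score(base_points, feature_points, present):
    """Sum the base points and the points of every feature that is present."""
    return base_points + sum(feature_points[name] for name in present)


BASE_POINTS, FEATURE_POINTS = build_scorecard(INTERCEPT, WEIGHTS)

if __name__ == "__main__":
    print(BASE_POINTS, FEATURE_POINTS)
    print(score(BASE_POINTS, FEATURE_POINTS, ["symptom_a", "symptom_b"]))
```

Because every feature contributes a fixed integer, a person's score is a plain sum of table entries, which is what makes this kind of model usable in a spreadsheet or on paper. In this sketch a higher score corresponds to higher modeled odds of infection, matching the direction stated in the description.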