External Validation of an Algorithm to Identify Patients with High Data-Completeness in Electronic Health Records for Comparative Effectiveness Research

Kueiyu Joshua Lin,1,2 Gary E Rosenthal,3 Shawn N Murphy,4,5 Kenneth D Mandl,6 Yinzhu Jin,1 Robert J Glynn,1 Sebastian Schneeweiss1 1Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA; 2Depart...


Bibliographic Details
Main Authors:	Lin KJ, Rosenthal GE, Murphy SN, Mandl KD, Jin Y, Glynn RJ, Schneeweiss S
Format:	Article
Language:	English
Published:	Dove Medical Press 2020-02-01
Series:	Clinical Epidemiology
Subjects:	electronic medical records data linkage comparative effectiveness research information bias continuity external validation
Online Access:	https://www.dovepress.com/external-validation-of-an-algorithm-to-identify-patients-with-high-dat-peer-reviewed-article-CLEP

id	doaj-e6cde278fe7d4c01bb71b9d076901f87
record_format	Article
spelling	doaj-e6cde278fe7d4c01bb71b9d076901f87; 2020-11-25T01:58:26Z; eng; Dove Medical Press; Clinical Epidemiology; 1179-1349; 2020-02-01; Volume 12; 133; 141; 51555; External Validation of an Algorithm to Identify Patients with High Data-Completeness in Electronic Health Records for Comparative Effectiveness Research; Lin KJ; Rosenthal GE; Murphy SN; Mandl KD; Jin Y; Glynn RJ; Schneeweiss S; https://www.dovepress.com/external-validation-of-an-algorithm-to-identify-patients-with-high-dat-peer-reviewed-article-CLEP; electronic medical records; data linkage; comparative effectiveness research; information bias; continuity; external validation
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Lin KJ Rosenthal GE Murphy SN Mandl KD Jin Y Glynn RJ Schneeweiss S
author_sort	Lin KJ
title	External Validation of an Algorithm to Identify Patients with High Data-Completeness in Electronic Health Records for Comparative Effectiveness Research
publisher	Dove Medical Press
series	Clinical Epidemiology
issn	1179-1349
publishDate	2020-02-01
description	Kueiyu Joshua Lin,1,2 Gary E Rosenthal,3 Shawn N Murphy,4,5 Kenneth D Mandl,6 Yinzhu Jin,1 Robert J Glynn,1 Sebastian Schneeweiss1
1Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA; 2Department of Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; 3Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, NC, USA; 4Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; 5Research Information Science and Computing, Partners Healthcare, Somerville, MA, USA; 6Computational Health Informatics Program, Boston Children’s Hospital, Harvard Medical School, Boston, MA, USA
Correspondence: Kueiyu Joshua Lin, Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, 1620 Tremont St. Suite 3030, Boston, MA 02120, USA. Tel +1 617 278-0930; Fax +1 617 232-8602; Email jklin@mgh.harvard.edu
Objective: Electronic health records (EHR) data-discontinuity, i.e. receiving care outside of a particular EHR system, may cause misclassification of study variables. We aimed to validate an algorithm to identify patients with high EHR data-continuity to reduce such bias.
Materials and Methods: We analyzed data from two EHR systems linked with Medicare claims data from 2007 through 2014, one in Massachusetts (MA, n=80,588) and the other in North Carolina (NC, n=33,207). We quantified EHR data-continuity by Mean Proportion of Encounters Captured (MPEC) by the EHR system when compared to complete recording in claims data. The prediction model for MPEC was developed in MA and validated in NC. Stratified by predicted EHR data-continuity, we quantified misclassification of 40 key variables by Mean Standardized Differences (MSD) between the proportions of these variables based on EHR alone vs the linked claims-EHR data.
Results: The mean MPEC was 27% in the MA and 26% in the NC system. The predicted and observed EHR data-continuity was highly correlated (Spearman correlation=0.78 and 0.73, respectively). The misclassification (MSD) of 40 variables in patients of the predicted EHR data-continuity cohort was significantly smaller (44%, 95% CI: 40–48%) than that in the remaining population.
Discussion: The comorbidity profiles were similar in patients with high vs low EHR data-continuity. Therefore, restricting an analysis to patients with high EHR data-continuity may reduce information bias while preserving the representativeness of the study cohort.
Conclusion: We have successfully validated an algorithm that can identify a high EHR data-continuity cohort representative of the source population.
Keywords: electronic medical records, data linkage, comparative effectiveness research, information bias, continuity, external validation
topic	electronic medical records data linkage comparative effectiveness research information bias continuity external validation
url	https://www.dovepress.com/external-validation-of-an-algorithm-to-identify-patients-with-high-dat-peer-reviewed-article-CLEP
_version_	1724969692625371136
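
Illustrative note on the two metrics named in the abstract. The record gives only verbal definitions of MPEC and MSD, so the following Python is a minimal sketch under stated assumptions, not the authors' implementation. Assumptions: MPEC is taken as the unweighted mean of per-encounter-type capture proportions (EHR count divided by claims count, capped at 1), and the standardized difference uses the common pooled-variance formula for two proportions. All function names and the encounter-type keys are hypothetical.

```python
from math import sqrt


def proportion_captured(ehr_count, claims_count):
    """Share of claims-recorded encounters that also appear in the EHR.

    Assumption: capped at 1.0; returns None when claims show no encounters.
    """
    if claims_count == 0:
        return None
    return min(ehr_count / claims_count, 1.0)


def mpec(encounters):
    """Mean Proportion of Encounters Captured for one patient (assumed form).

    encounters maps an encounter type to a (ehr_count, claims_count) pair.
    Types with no claims encounters are skipped.
    """
    props = [proportion_captured(e, c) for e, c in encounters.values()]
    props = [p for p in props if p is not None]
    return sum(props) / len(props) if props else None


def standardized_difference(p1, p2):
    """Standardized difference between two proportions (pooled-variance form)."""
    denom = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / 2)
    return 0.0 if denom == 0 else (p1 - p2) / denom


def mean_standardized_difference(ehr_props, linked_props):
    """Mean absolute standardized difference across paired variable proportions."""
    diffs = [abs(standardized_difference(a, b))
             for a, b in zip(ehr_props, linked_props)]
    return sum(diffs) / len(diffs)
```

For example, `mpec({"outpatient": (3, 10), "inpatient": (1, 2)})` averages capture proportions of 0.3 and 0.5. In the study's terms, `ehr_props` would hold the prevalence of each of the 40 variables from EHR data alone and `linked_props` the prevalence from the linked claims-EHR data.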
