Concept Libraries for Automatic Electronic Health Record Based Phenotyping: A Review

Introduction Electronic health records (EHR) are linked together to examine disease history and to undertake research into the causes and outcomes of disease. However, the process of constructing algorithms for phenotyping (e.g., identifying disease characteristics) or health characteristics (e.g.,...

Bibliographic Details
Main Authors: Zahra Ahmed Almowil, Shang-Ming Zhou, Sinead Brophy
Format: Article
Language: English
Published: Swansea University 2021-06-01
Series: International Journal of Population Data Science
Online Access: https://ijpds.org/article/view/1362
id doaj-ed61ade0a19d42008607db57c769294c
record_format Article
spelling doaj-ed61ade0a19d42008607db57c769294c
2021-06-19T17:29:36Z
eng
Swansea University
International Journal of Population Data Science
2399-4908
2021-06-01
Volume 6, Issue 1
10.23889/ijpds.v6i1.1362
Concept Libraries for Automatic Electronic Health Record Based Phenotyping: A Review
Zahra Ahmed Almowil (Swansea University Medical School, Wales SA2 8PP)
Shang-Ming Zhou (Centre for Health Technology, Faculty of Health, University of Plymouth, Plymouth, PL4 8AA, UK)
Sinead Brophy (Swansea University Medical School, Wales SA2 8PP)
https://ijpds.org/article/view/1362
collection DOAJ
language English
format Article
sources DOAJ
author Zahra Ahmed Almowil
Shang-Ming Zhou
Sinead Brophy
spellingShingle Zahra Ahmed Almowil
Shang-Ming Zhou
Sinead Brophy
Concept Libraries for Automatic Electronic Health Record Based Phenotyping: A Review
International Journal of Population Data Science
author_facet Zahra Ahmed Almowil
Shang-Ming Zhou
Sinead Brophy
author_sort Zahra Ahmed Almowil
title Concept Libraries for Automatic Electronic Health Record Based Phenotyping: A Review
title_short Concept Libraries for Automatic Electronic Health Record Based Phenotyping: A Review
title_full Concept Libraries for Automatic Electronic Health Record Based Phenotyping: A Review
title_fullStr Concept Libraries for Automatic Electronic Health Record Based Phenotyping: A Review
title_full_unstemmed Concept Libraries for Automatic Electronic Health Record Based Phenotyping: A Review
title_sort concept libraries for automatic electronic health record based phenotyping: a review
publisher Swansea University
series International Journal of Population Data Science
issn 2399-4908
publishDate 2021-06-01
description Introduction: Electronic health records (EHR) are linked together to examine disease history and to undertake research into the causes and outcomes of disease. However, the process of constructing algorithms for phenotyping (e.g., identifying disease characteristics) or health characteristics (e.g., smoker) is very time-consuming and resource-intensive. In addition, results can vary greatly between researchers. Reusing or building on algorithms that others have created is a compelling solution to these problems. However, sharing algorithms is not a common practice, and many published studies do not detail the clinical code lists used by the researchers in the disease/characteristic definition. To address these challenges, a number of centres across the world have developed health data portals which contain concept libraries (e.g., algorithms for defining concepts such as disease and characteristics) in order to facilitate disease phenotyping and health studies.
Objectives: This study aims to review the literature of existing concept libraries, examine their utilities, identify the current gaps, and suggest future developments.
Methods: The five-stage framework of Arksey and O'Malley was used for the literature search. This approach included defining the research questions, identifying relevant studies through literature review, selecting eligible studies, charting and extracting data, and summarising and reporting the findings.
Results: This review identified seven publicly accessible electronic health data concept libraries which were developed in different countries including the UK, the USA, and Canada. The concept libraries (n = 7) investigated were either general libraries that hold phenotypes of multiple specialties (n = 4) or specialized libraries that manage only certain specialties such as rare diseases (n = 3). There were some clear differences between the general libraries, such as archiving data from different electronic sources and using a range of different types of coding systems. However, they share some clear similarities, such as enabling users to upload their own code lists and allowing users to use/download the publicly accessible code. In addition, there were some differences between the specialized libraries, such as differences in search capability and in whether different types of query, such as simple or complex searches, were supported. Conversely, there were some similarities between the specialized libraries, such as enabling users to upload their own concepts into the libraries and to show where they were published, which facilitates assessing the validity of the concepts. All the specialized libraries aimed to encourage the reuse of research methods such as lists of clinical codes and/or metadata.
Conclusion: The seven libraries identified have been developed independently and appear to replicate similar concepts but in different ways. Collaboration between similar libraries would greatly facilitate the use of these libraries for the user. The process of building code lists takes time and effort. Access to existing code lists increases consistency and accuracy of definitions across studies. Concept library developers should collaborate with each other to raise awareness of their existence and of their various functions, which could increase users’ contributions to those libraries and promote their wide-ranging adoption.
url https://ijpds.org/article/view/1362