ORBDA: An openEHR benchmark dataset for performance assessment of electronic health record servers.

The openEHR specifications are designed to support implementation of flexible and interoperable Electronic Health Record (EHR) systems. Despite the increasing number of solutions based on the openEHR specifications, it is difficult to find publicly available healthcare datasets in the openEHR format...

Bibliographic Details
Main Authors: Douglas Teodoro, Erik Sundvall, Mario João Junior, Patrick Ruch, Sergio Miranda Freire
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2018-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC5749730?pdf=render
id doaj-14858f8f0d8d4013947fc503a7117639
record_format Article
spelling doaj-14858f8f0d8d4013947fc503a7117639 2020-11-25T02:29:05Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2018-01-01 13 1 e0190028 10.1371/journal.pone.0190028 http://europepmc.org/articles/PMC5749730?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Douglas Teodoro
Erik Sundvall
Mario João Junior
Patrick Ruch
Sergio Miranda Freire
title ORBDA: An openEHR benchmark dataset for performance assessment of electronic health record servers.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2018-01-01
description The openEHR specifications are designed to support the implementation of flexible and interoperable Electronic Health Record (EHR) systems. Despite the increasing number of solutions based on the openEHR specifications, it is difficult to find publicly available healthcare datasets in the openEHR format that can be used to test, compare and validate different data persistence mechanisms for openEHR. To foster research on openEHR servers, we present the openEHR Benchmark Dataset, ORBDA, a very large healthcare benchmark dataset encoded using the openEHR formalism. To construct ORBDA, we extracted and cleaned a de-identified dataset from the Brazilian National Healthcare System (SUS) containing information on hospitalisations and high-complexity procedures, and formalised it using a set of openEHR archetypes and templates. Then, we implemented a tool to enrich the raw relational data and convert it into the openEHR model using the openEHR Java reference model library. The ORBDA dataset is available as openEHR composition, versioned composition and EHR representations, in XML and JSON formats. In total, the dataset contains more than 150 million composition records. We describe the dataset and provide means to access it. Additionally, we demonstrate the use of ORBDA for evaluating the insertion throughput and query latency of some NoSQL database management systems. We believe that ORBDA is a valuable asset for assessing storage models for openEHR-based information systems during the software engineering process. It may also be a suitable component in future standardised benchmarking of available openEHR storage platforms.
url http://europepmc.org/articles/PMC5749730?pdf=render