Efficient Identification of Patients Eligible for Clinical Studies Using Case-Based Reasoning on The Scottish Health Research Register (SHARE)
Introduction Trials often struggle to achieve their target sample size with only half doing so. Some researchers have turned to Electronic Health Records (EHRs), seeking a more efficient way of recruitment. The Scottish Health Research Register (SHARE) obtained patients’ consent for their EHRs to b...
Main Authors: | Wen Shi, Tom Kelsey, Frank Sullivan |
---|---|
Format: | Article |
Language: | English |
Published: | Swansea University, 2020-12-01 |
Series: | International Journal of Population Data Science |
Online Access: | https://ijpds.org/article/view/1509 |
id |
doaj-9602751547e946958698c9251e94ce11 |
---|---|
record_format |
Article |
spelling |
doaj-9602751547e946958698c9251e94ce11; 2021-02-10T16:42:47Z; eng; Swansea University; International Journal of Population Data Science; 2399-4908; 2020-12-01; 5; 5; Efficient Identification of Patients Eligible for Clinical Studies Using Case-Based Reasoning on The Scottish Health Research Register (SHARE); Wen Shi, Tom Kelsey, Frank Sullivan (University of St. Andrews); https://ijpds.org/article/view/1509 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Wen Shi, Tom Kelsey, Frank Sullivan |
author_sort |
Wen Shi |
title |
Efficient Identification of Patients Eligible for Clinical Studies Using Case-Based Reasoning on The Scottish Health Research Register (SHARE) |
publisher |
Swansea University |
series |
International Journal of Population Data Science |
issn |
2399-4908 |
publishDate |
2020-12-01 |
description |
Introduction
Trials often struggle to achieve their target sample size, with only half doing so. Some researchers have turned to Electronic Health Records (EHRs) in search of a more efficient means of recruitment. The Scottish Health Research Register (SHARE) obtained patients’ consent for their EHRs to be used as a search base from which researchers can find potential participants. However, because EHR data are not complete, sufficient, or accurate, a database search strategy may not generate the best case-finding result.
Objectives and Approach
A retrospective study was conducted to evaluate the performance of a case-based reasoning method in identifying participants for population-based clinical studies that had recruited through SHARE. A case-based reasoning framework was applied to nine studies with a total of 119 participants, using two-fold cross-validation. Records of 30,000 random individuals were also merged with each test set to simulate a real-world recruitment setting. A prediction score for study participation was generated for each individual in the test set by comparing their diagnosis, procedure, pharmaceutical prescription, and laboratory test result attributes with those of the participants of a particular study. The test set was then ranked by prediction score, and the ranking was evaluated using the Area Under the ROC Curve and information retrieval metrics. We also compared the most likely participants as identified by searching a database with those ranked highest by our model.
Results
The average ROC AUC across the nine projects was 81%, indicating strong predictive ability. However, the derived ranking lists showed lower predictive performance. Of the persons ranked within the top 50 positions, 21% were the same as those identified by searching databases.
Conclusion / Implications
Case-based reasoning may be more effective than a database search strategy for participant identification. This hypothesis requires a prospective study for further validation. The lower performance of the ranking lists suggests that improvements are needed in the collection and curation of EHRs.
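Illustrative sketch
The evaluation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which similarity measure was used, so Jaccard similarity over sets of attribute codes and a mean-over-cases prediction score are assumptions, and all function names, identifiers, and toy records below are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two attribute-code sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def prediction_score(candidate: Set[str], cases: List[Set[str]]) -> float:
    """Assumed score: mean similarity of a candidate to known participants."""
    if not cases:
        return 0.0
    return sum(jaccard(candidate, c) for c in cases) / len(cases)


def roc_auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the chance a participant outranks a non-participant (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_k_overlap(ranked_ids: List[str], search_hits: Set[str], k: int) -> float:
    """Fraction of the top-k ranked persons also found by a database search."""
    return len(set(ranked_ids[:k]) & search_hits) / k


# Hypothetical case base: attribute codes of known participants of one study.
cases: List[Set[str]] = [
    {"dx:E11", "rx:metformin", "lab:hba1c_high"},
    {"dx:E11", "rx:metformin"},
]

# Hypothetical test set: held-out participants (label 1) merged with
# random individuals (label 0), mirroring the setup in the abstract.
test_set: Dict[str, Tuple[Set[str], int]] = {
    "p1": ({"dx:E11", "rx:metformin", "lab:hba1c_high"}, 1),
    "p2": ({"dx:E11"}, 1),
    "r1": ({"dx:J45", "rx:salbutamol"}, 0),
    "r2": ({"rx:metformin", "dx:I10"}, 0),
}

scores = {pid: prediction_score(attrs, cases) for pid, (attrs, _) in test_set.items()}
ranked = sorted(scores, key=scores.get, reverse=True)  # ranking list by score
labels = [test_set[pid][1] for pid in ranked]
auc = roc_auc([scores[pid] for pid in ranked], labels)

# Hypothetical result of a conventional database search for the same study.
overlap = top_k_overlap(ranked, {"p1", "r2"}, k=2)

print(ranked, round(auc, 3), overlap)
```

In the study itself the ranking list covered held-out participants merged with 30,000 random individuals and the overlap was measured over the top 50 positions; the toy data here only show how the quantities relate.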
|
url |
https://ijpds.org/article/view/1509 |