Machine learning to help researchers evaluate biases in clinical trials: a prospective, randomized user study

Abstract Objective Assessing risks of bias in randomized controlled trials (RCTs) is an important but laborious task when conducting systematic reviews. RobotReviewer (RR), an open-source machine learning (ML) system, semi-automates bias assessments. We conducted a user study of RobotReviewer, evalu...


Bibliographic Details
Main Authors: Frank Soboczenski, Thomas A. Trikalinos, Joël Kuiper, Randolph G. Bias, Byron C. Wallace, Iain J. Marshall
Format: Article
Language: English
Published: BMC 2019-05-01
Series: BMC Medical Informatics and Decision Making
Online Access: http://link.springer.com/article/10.1186/s12911-019-0814-z
id doaj-2439d8044ca64884b24df73ffaa3c92c
record_format Article
spelling doaj-2439d8044ca64884b24df73ffaa3c92c
2020-11-25T02:13:44Z
eng
BMC
BMC Medical Informatics and Decision Making
1472-6947
2019-05-01
191112
10.1186/s12911-019-0814-z
Machine learning to help researchers evaluate biases in clinical trials: a prospective, randomized user study
Frank Soboczenski 0
Thomas A. Trikalinos 1
Joël Kuiper 2
Randolph G. Bias 3
Byron C. Wallace 4
Iain J. Marshall 5
School of Population Health & Environmental Sciences, Faculty of Life Sciences and Medicine, King’s College London
Center for Evidence Synthesis in Health, Brown University
Vortext Systems
School of Information, University of Texas at Austin
Khoury College of Computer Sciences, Northeastern University
School of Population Health & Environmental Sciences, Faculty of Life Sciences and Medicine, King’s College London
Abstract
Objective: Assessing risks of bias in randomized controlled trials (RCTs) is an important but laborious task when conducting systematic reviews. RobotReviewer (RR), an open-source machine learning (ML) system, semi-automates bias assessments. We conducted a user study of RobotReviewer, evaluating time saved and usability of the tool.
Materials and methods: Systematic reviewers applied the Cochrane Risk of Bias tool to four randomly selected RCT articles. Reviewers judged: whether an RCT was at low, or high/unclear risk of bias for each bias domain in the Cochrane tool (Version 1); and highlighted article text justifying their decision. For a random two of the four articles, the process was semi-automated: users were provided with ML-suggested bias judgments and text highlights. Participants could amend the suggestions if necessary. We measured time taken for the task, ML suggestions, usability via the System Usability Scale (SUS) and collected qualitative feedback.
Results: For 41 volunteers, semi-automation was quicker than manual assessment (mean 755 vs. 824 s; relative time 0.75, 95% CI 0.62–0.92). Reviewers accepted 301/328 (91%) of the ML Risk of Bias (RoB) judgments, and 202/328 (62%) of text highlights without change. Overall, ML suggested text highlights had a recall of 0.90 (SD 0.14) and precision of 0.87 (SD 0.21) with respect to the users’ final versions. Reviewers assigned the system a mean 77.7 SUS score, corresponding to a rating between “good” and “excellent”.
Conclusions: Semi-automation (where humans validate machine learning suggestions) can improve the efficiency of evidence synthesis. Our system was rated highly usable, and expedited bias assessment of RCTs.
http://link.springer.com/article/10.1186/s12911-019-0814-z
collection DOAJ
language English
format Article
sources DOAJ
author Frank Soboczenski
Thomas A. Trikalinos
Joël Kuiper
Randolph G. Bias
Byron C. Wallace
Iain J. Marshall
title Machine learning to help researchers evaluate biases in clinical trials: a prospective, randomized user study
publisher BMC
series BMC Medical Informatics and Decision Making
issn 1472-6947
publishDate 2019-05-01
description Abstract
Objective: Assessing risks of bias in randomized controlled trials (RCTs) is an important but laborious task when conducting systematic reviews. RobotReviewer (RR), an open-source machine learning (ML) system, semi-automates bias assessments. We conducted a user study of RobotReviewer, evaluating time saved and usability of the tool.
Materials and methods: Systematic reviewers applied the Cochrane Risk of Bias tool to four randomly selected RCT articles. Reviewers judged: whether an RCT was at low, or high/unclear risk of bias for each bias domain in the Cochrane tool (Version 1); and highlighted article text justifying their decision. For a random two of the four articles, the process was semi-automated: users were provided with ML-suggested bias judgments and text highlights. Participants could amend the suggestions if necessary. We measured time taken for the task, ML suggestions, usability via the System Usability Scale (SUS) and collected qualitative feedback.
Results: For 41 volunteers, semi-automation was quicker than manual assessment (mean 755 vs. 824 s; relative time 0.75, 95% CI 0.62–0.92). Reviewers accepted 301/328 (91%) of the ML Risk of Bias (RoB) judgments, and 202/328 (62%) of text highlights without change. Overall, ML suggested text highlights had a recall of 0.90 (SD 0.14) and precision of 0.87 (SD 0.21) with respect to the users’ final versions. Reviewers assigned the system a mean 77.7 SUS score, corresponding to a rating between “good” and “excellent”.
Conclusions: Semi-automation (where humans validate machine learning suggestions) can improve the efficiency of evidence synthesis. Our system was rated highly usable, and expedited bias assessment of RCTs.
url http://link.springer.com/article/10.1186/s12911-019-0814-z
_version_ 1724903372481363968
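
Note on the highlight metrics: the abstract reports precision and recall of the ML-suggested text highlights with respect to the users' final versions, but this record does not say how overlap was measured. The Python sketch below is one plausible reading, assuming each highlight is represented as a set of token positions. The function name and the example positions are hypothetical, not taken from the study or from RobotReviewer.

```python
def highlight_precision_recall(suggested, final):
    """Compare suggested highlights with the user's final highlights.

    Both arguments are sets of token positions (an assumed unit of
    comparison; the record does not specify one).
    Precision = share of suggested tokens the user kept.
    Recall    = share of the user's final tokens that were suggested.
    """
    overlap = len(suggested & final)
    precision = overlap / len(suggested) if suggested else 0.0
    recall = overlap / len(final) if final else 0.0
    return precision, recall


# Hypothetical example: the system highlights tokens 10-19; the user
# drops tokens 10-11 and extends the highlight to tokens 20-21.
suggested = set(range(10, 20))
final = set(range(12, 22))
precision, recall = highlight_precision_recall(suggested, final)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under this reading, a highlight accepted without change scores 1.0 on both measures, and the reported means would be averages of such per-highlight scores.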