Statistical stopping criteria for automated screening in systematic reviews
Abstract: Active learning for systematic review screening promises to reduce the human effort required to identify relevant documents for a systematic review. Machines and humans work together, with humans providing training data and the machine optimising the documents that the humans screen. This enables the identification of all relevant documents after viewing only a fraction of the total documents. However, current approaches lack robust stopping criteria, so reviewers do not know when they have seen all or a certain proportion of relevant documents. This means that such systems are hard to implement in live reviews. This paper introduces a workflow with flexible statistical stopping criteria, which offer real work reductions on the basis of rejecting a hypothesis of having missed a given recall target with a given level of confidence. The stopping criteria are shown on test datasets to achieve a reliable level of recall, while still providing work reductions of on average 17%. Other methods proposed previously are shown to provide inconsistent recall and work reductions across datasets.
| Main Authors | Max W Callaghan, Finn Müller-Hansen |
|---|---|
| Format | Article |
| Language | English |
| Published | BMC, 2020-11-01 |
| Series | Systematic Reviews |
| ISSN | 2046-4053 |
| Subjects | Systematic review; Machine learning; Active learning; Stopping criteria |
| Online Access | https://doi.org/10.1186/s13643-020-01521-4 |
| Source | DOAJ (record id doaj-3e15c7df2ff647afab5e32dacfb7a350) |
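The abstract describes stopping rules that reject, at a chosen confidence level, the hypothesis that a given recall target has been missed. The sketch below illustrates one way such a test can be framed: after prioritised screening stops, a random verification sample is drawn from the remaining documents and a hypergeometric test asks whether so few relevant documents would plausibly have been observed if the recall target had in fact been missed. This is an illustrative simplification under stated assumptions, not the authors' published procedure; the function name `stop_screening` and the example figures are invented for demonstration.

```python
from math import floor
from scipy.stats import hypergeom


def stop_screening(n_remaining, r_found, sample_size, sample_relevant,
                   recall_target=0.95, confidence=0.95):
    """Decide whether screening can stop, based on a random verification sample.

    Tests H0: "recall at the stopping point is below recall_target".
    Returns (reject_H0, p_value); screening may stop when reject_H0 is True.

    n_remaining     -- documents left unscreened when prioritised screening stopped
    r_found         -- relevant documents found during prioritised screening
    sample_size     -- size of the random sample drawn from the remaining documents
    sample_relevant -- relevant documents observed in that random sample
    """
    # Under H0, at least k_min relevant documents remain unscreened:
    # recall < target  <=>  remaining relevant > r_found * (1 - target) / target
    k_min = floor(r_found * (1 - recall_target) / recall_target) + 1

    # Worst-case (largest) p-value under H0: the probability of seeing
    # `sample_relevant` or fewer relevant documents in the sample if exactly
    # k_min relevant documents were hiding in the remaining pool.
    p_value = hypergeom.cdf(sample_relevant, n_remaining, k_min, sample_size)

    return p_value < (1 - confidence), p_value


# Hypothetical example: 5,000 unscreened documents remain, 190 relevant
# documents were found during prioritised screening, and a random sample
# of 1,500 of the remaining documents contains none that are relevant.
stop, p = stop_screening(5_000, 190, 1_500, 0,
                         recall_target=0.95, confidence=0.95)
print(f"stop={stop}, p-value={p:.4f}")  # p is approximately 0.02, so H0 is rejected
```

Because the p-value is evaluated at the smallest number of remaining relevant documents consistent with missing the target, rejecting H0 here implies rejection for every scenario in which recall is below the target, which makes the decision conservative with respect to the stated confidence level.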