Searching PubMed to Retrieve Publications on the COVID-19 Pandemic: Comparative Analysis of Search Strings

Background: Since it was declared a pandemic on March 11, 2020, COVID-19 has dominated headlines around the world and researchers have generated thousands of scientific articles about the disease. The fast speed of publication has challenged researchers and other stakeholders t...

Bibliographic Details
Main Authors: Lazarus, Jeffrey V; Palayew, Adam; Rasmussen, Lauge Neimann; Andersen, Tue Helms; Nicholson, Joey; Norgaard, Ole
Format: Article
Language: English
Published: JMIR Publications 2020-11-01
Series: Journal of Medical Internet Research
Online Access: http://www.jmir.org/2020/11/e23449/
id doaj-08549ff52e1245afa13f731dad8baf09
record_format Article
spelling doaj-08549ff52e1245afa13f731dad8baf09 | 2021-04-02T19:00:50Z | eng | JMIR Publications | Journal of Medical Internet Research | 1438-8871 | 2020-11-01 | 22 | 11 | e23449 | 10.2196/23449 | Searching PubMed to Retrieve Publications on the COVID-19 Pandemic: Comparative Analysis of Search Strings | Lazarus, Jeffrey V | Palayew, Adam | Rasmussen, Lauge Neimann | Andersen, Tue Helms | Nicholson, Joey | Norgaard, Ole | http://www.jmir.org/2020/11/e23449/
collection DOAJ
language English
format Article
sources DOAJ
author Lazarus, Jeffrey V
Palayew, Adam
Rasmussen, Lauge Neimann
Andersen, Tue Helms
Nicholson, Joey
Norgaard, Ole
author_sort Lazarus, Jeffrey V
title Searching PubMed to Retrieve Publications on the COVID-19 Pandemic: Comparative Analysis of Search Strings
publisher JMIR Publications
series Journal of Medical Internet Research
issn 1438-8871
publishDate 2020-11-01
description Background: Since it was declared a pandemic on March 11, 2020, COVID-19 has dominated headlines around the world and researchers have generated thousands of scientific articles about the disease. The fast speed of publication has challenged researchers and other stakeholders to keep up with the volume of published articles. To search the literature effectively, researchers use databases such as PubMed.
Objective: The aim of this study is to evaluate the performance of different searches for COVID-19 records in PubMed and to assess the complexity of searches required.
Methods: We tested PubMed searches for COVID-19 to identify which search string performed best according to standard metrics (sensitivity, precision, and F-score). We evaluated the performance of 8 different searches in PubMed during the first 10 weeks of the COVID-19 pandemic to investigate how complex a search string is needed. We also tested omitting hyphens and space characters as well as applying quotation marks.
Results: The two most comprehensive search strings combining several free-text and indexed search terms performed best in terms of sensitivity (98.4%/98.7%) and F-score (96.5%/95.7%), but the single-term search COVID-19 performed best in terms of precision (95.3%) and well in terms of sensitivity (94.4%) and F-score (94.8%). The term Wuhan virus performed the worst: 7.7% for sensitivity, 78.1% for precision, and 14.0% for F-score. We found that deleting a hyphen or space character could omit a substantial number of records, especially when searching with SARS-CoV-2 as a single term.
Conclusions: Comprehensive search strings combining free-text and indexed search terms performed better than single-term searches in PubMed, but not by a large margin compared to the single term COVID-19. For everyday searches, certain single-term searches that are entered correctly are probably sufficient, whereas more comprehensive searches should be used for systematic reviews. Still, we suggest additional measures that the US National Library of Medicine could take to support all PubMed users in searching the COVID-19 literature.
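The abstract reports sensitivity, precision, and F-score for each search but does not spell out how the F-score is computed. The sketch below assumes it is the balanced F1 measure, that is, the harmonic mean of sensitivity and precision; the function name `f_score` and the `reported` dictionary are illustrative, not taken from the article. Under that assumption, the percentages quoted above for the COVID-19 and Wuhan virus searches can be checked against each other.

```python
def f_score(sensitivity: float, precision: float) -> float:
    """Harmonic mean of sensitivity (recall) and precision, both as fractions.

    Assumption: the article's "F-score" is the balanced F1 measure.
    """
    if sensitivity + precision == 0:
        return 0.0
    return 2 * sensitivity * precision / (sensitivity + precision)


# Percentages quoted in the abstract: (sensitivity, precision, reported F-score).
reported = {
    "COVID-19": (94.4, 95.3, 94.8),
    "Wuhan virus": (7.7, 78.1, 14.0),
}

for term, (sens, prec, f_reported) in reported.items():
    f_computed = 100 * f_score(sens / 100, prec / 100)
    print(f"{term}: computed F-score {f_computed:.1f}% (reported {f_reported}%)")
```

The Wuhan virus figures show why the F-score matters here: precision of 78.1% looks acceptable on its own, but the harmonic mean is pulled down toward the 7.7% sensitivity, so a search that misses most relevant records scores poorly overall.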
url http://www.jmir.org/2020/11/e23449/