Need of care in interpreting Google Trends-based COVID-19 infodemiological study results: potential risk of false-positivity

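Abstract

Background: Google Trends (GT) is being used as an epidemiological tool to study coronavirus disease (COVID-19) by identifying keywords in search trends that are predictive for the COVID-19 epidemiological burden. However, many of the earlier GT-based studies include potential statistical fallacies by measuring the correlation between non-stationary time sequences without adjusting for multiple comparisons or the confounding of media coverage, leading to concerns about the increased risk of obtaining false-positive results. In this study, we aimed to apply statistically more favorable methods to validate the earlier GT-based COVID-19 study results.

Methods: We extracted the relative GT search volume for keywords associated with COVID-19 symptoms, and evaluated their Granger-causality to weekly COVID-19 positivity in eight English-speaking countries and Japan. In addition, the impact of media coverage on keywords with significant Granger-causality was further evaluated using Japanese regional data.

Results: Our Granger causality-based approach largely decreased (by up to approximately one-third) the number of keywords identified as having a significant temporal relationship with the COVID-19 trend when compared to those identified by Pearson or Spearman’s rank correlation-based approach. “Sense of smell” and “loss of smell” were the most reliable GT keywords across all the evaluated countries; however, when adjusted with their media coverage, these keyword trends did not Granger-cause the COVID-19 positivity trends (in Japan).

Conclusions: Our results suggest that some of the search keywords reported as candidate predictive measures in earlier GT-based COVID-19 studies may potentially be unreliable; therefore, caution is necessary when interpreting published GT-based study results.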
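The abstract's concern about correlating non-stationary time sequences can be illustrated with a small simulation. The sketch below is not the authors' analysis (the record states that they used Granger causality with a vector autoregression model); it only shows, under assumed settings, how often two unrelated random walks appear strongly correlated, and how that changes once each series is first-differenced. The function names, the series length of 100 weeks, and the |r| > 0.5 threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient of two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


def spurious_rate(n_pairs=500, n_weeks=100, threshold=0.5, seed=0):
    """Fraction of independent random-walk pairs whose |r| exceeds threshold,
    computed on the raw levels and on the first differences."""
    rng = np.random.default_rng(seed)
    hits_levels = 0
    hits_diffs = 0
    for _ in range(n_pairs):
        # Two independent random walks: non-stationary and unrelated by construction.
        a = np.cumsum(rng.standard_normal(n_weeks))
        b = np.cumsum(rng.standard_normal(n_weeks))
        if abs(pearson(a, b)) > threshold:
            hits_levels += 1
        # First differences recover the stationary, independent increments.
        if abs(pearson(np.diff(a), np.diff(b))) > threshold:
            hits_diffs += 1
    return hits_levels / n_pairs, hits_diffs / n_pairs


levels_rate, diffs_rate = spurious_rate()
print(f"|r| > 0.5 on raw levels:        {levels_rate:.2f}")
print(f"|r| > 0.5 on first differences: {diffs_rate:.2f}")
```

In runs of this kind, the raw levels typically produce a large share of strong correlations although the series are unrelated, while the differenced series rarely do. This is one reason a correlation between a search trend and a case trend is weak evidence of a predictive relationship on its own.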

Bibliographic Details
Main Authors:	Kenichiro Sato, Tatsuo Mano, Atsushi Iwata, Tatsushi Toda
Format:	Article
Language:	English
Published:	BMC 2021-07-01
Series:	BMC Medical Research Methodology
Subjects:	COVID-19; Google Trends; Infodemiology; Vector autoregression model; Granger causality
Online Access:	https://doi.org/10.1186/s12874-021-01338-2

id	doaj-40e558dab78349579802d6437157d3e0
record_format	Article
spelling	doaj-40e558dab78349579802d6437157d3e0 | 2021-07-18T11:48:34Z | eng | BMC | BMC Medical Research Methodology | 1471-2288 | 2021-07-01 | 211110 | 10.1186/s12874-021-01338-2 | Need of care in interpreting Google Trends-based COVID-19 infodemiological study results: potential risk of false-positivity | Kenichiro Sato (0); Tatsuo Mano (1); Atsushi Iwata (2); Tatsushi Toda (3) | Department of Neurology, Graduate School of Medicine, University of Tokyo (listed for each of the four authors) | https://doi.org/10.1186/s12874-021-01338-2 | COVID-19; Google Trends; Infodemiology; Vector autoregression model; Granger causality
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Kenichiro Sato Tatsuo Mano Atsushi Iwata Tatsushi Toda
author_facet	Kenichiro Sato Tatsuo Mano Atsushi Iwata Tatsushi Toda
author_sort	Kenichiro Sato
title	Need of care in interpreting Google Trends-based COVID-19 infodemiological study results: potential risk of false-positivity
publisher	BMC
series	BMC Medical Research Methodology
issn	1471-2288
publishDate	2021-07-01
description	Abstract Background Google Trends (GT) is being used as an epidemiological tool to study coronavirus disease (COVID-19) by identifying keywords in search trends that are predictive for the COVID-19 epidemiological burden. However, many of the earlier GT-based studies include potential statistical fallacies by measuring the correlation between non-stationary time sequences without adjusting for multiple comparisons or the confounding of media coverage, leading to concerns about the increased risk of obtaining false-positive results. In this study, we aimed to apply statistically more favorable methods to validate the earlier GT-based COVID-19 study results. Methods We extracted the relative GT search volume for keywords associated with COVID-19 symptoms, and evaluated their Granger-causality to weekly COVID-19 positivity in eight English-speaking countries and Japan. In addition, the impact of media coverage on keywords with significant Granger-causality was further evaluated using Japanese regional data. Results Our Granger causality-based approach largely decreased (by up to approximately one-third) the number of keywords identified as having a significant temporal relationship with the COVID-19 trend when compared to those identified by Pearson or Spearman’s rank correlation-based approach. “Sense of smell” and “loss of smell” were the most reliable GT keywords across all the evaluated countries; however, when adjusted with their media coverage, these keyword trends did not Granger-cause the COVID-19 positivity trends (in Japan). Conclusions Our results suggest that some of the search keywords reported as candidate predictive measures in earlier GT-based COVID-19 studies may potentially be unreliable; therefore, caution is necessary when interpreting published GT-based study results.
topic	COVID-19; Google Trends; Infodemiology; Vector autoregression model; Granger causality
url	https://doi.org/10.1186/s12874-021-01338-2

Similar Items