Real-time estimation of disease activity in emerging outbreaks using internet search information.

Understanding the behavior of emerging disease outbreaks in, or ahead of, real-time could help healthcare officials better design interventions to mitigate impacts on affected populations. Most healthcare-based disease surveillance systems, however, have significant inherent reporting delays due to...

Bibliographic Details
Main Authors: Emily L Aiken, Sarah F McGough, Maimuna S Majumder, Gal Wachtel, Andre T Nguyen, Cecile Viboud, Mauricio Santillana
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2020-08-01
Series: PLoS Computational Biology
Online Access: https://doi.org/10.1371/journal.pcbi.1008117
id doaj-1cd74c97bb164125b89866d155d302ff
record_format Article
spelling doaj-1cd74c97bb164125b89866d155d302ff; 2021-07-12T04:31:31Z; eng; Public Library of Science (PLoS); PLoS Computational Biology; 1553-734X; 1553-7358; 2020-08-01; 16; 8; e1008117; 10.1371/journal.pcbi.1008117; Real-time estimation of disease activity in emerging outbreaks using internet search information.; Emily L Aiken; Sarah F McGough; Maimuna S Majumder; Gal Wachtel; Andre T Nguyen; Cecile Viboud; Mauricio Santillana; Understanding the behavior of emerging disease outbreaks in, or ahead of, real-time could help healthcare officials better design interventions to mitigate impacts on affected populations. Most healthcare-based disease surveillance systems, however, have significant inherent reporting delays due to data collection, aggregation, and distribution processes. Recent work has shown that machine learning methods leveraging a combination of traditionally collected epidemiological information and novel Internet-based data sources, such as disease-related Internet search activity, can produce meaningful "nowcasts" of disease incidence ahead of healthcare-based estimates, with most successful case studies focusing on endemic and seasonal diseases such as influenza and dengue. Here, we apply similar computational methods to emerging outbreaks in geographic regions where no historical presence of the disease of interest has been observed. By combining the limited historical epidemiological data available with disease-related Internet search activity, we retrospectively estimate disease activity in five recent outbreaks weeks ahead of traditional surveillance methods. We find that the proposed computational methods frequently provide useful real-time incidence estimates that can help fill temporal data gaps resulting from surveillance reporting delays. However, the proposed methods are limited by issues of sample bias and skew in search query volumes, perhaps as a result of media coverage.; https://doi.org/10.1371/journal.pcbi.1008117
collection DOAJ
language English
format Article
sources DOAJ
author Emily L Aiken
Sarah F McGough
Maimuna S Majumder
Gal Wachtel
Andre T Nguyen
Cecile Viboud
Mauricio Santillana
spellingShingle Emily L Aiken
Sarah F McGough
Maimuna S Majumder
Gal Wachtel
Andre T Nguyen
Cecile Viboud
Mauricio Santillana
Real-time estimation of disease activity in emerging outbreaks using internet search information.
PLoS Computational Biology
author_facet Emily L Aiken
Sarah F McGough
Maimuna S Majumder
Gal Wachtel
Andre T Nguyen
Cecile Viboud
Mauricio Santillana
author_sort Emily L Aiken
title Real-time estimation of disease activity in emerging outbreaks using internet search information.
title_short Real-time estimation of disease activity in emerging outbreaks using internet search information.
title_full Real-time estimation of disease activity in emerging outbreaks using internet search information.
title_fullStr Real-time estimation of disease activity in emerging outbreaks using internet search information.
title_full_unstemmed Real-time estimation of disease activity in emerging outbreaks using internet search information.
title_sort real-time estimation of disease activity in emerging outbreaks using internet search information.
publisher Public Library of Science (PLoS)
series PLoS Computational Biology
issn 1553-734X
1553-7358
publishDate 2020-08-01
description Understanding the behavior of emerging disease outbreaks in, or ahead of, real-time could help healthcare officials better design interventions to mitigate impacts on affected populations. Most healthcare-based disease surveillance systems, however, have significant inherent reporting delays due to data collection, aggregation, and distribution processes. Recent work has shown that machine learning methods leveraging a combination of traditionally collected epidemiological information and novel Internet-based data sources, such as disease-related Internet search activity, can produce meaningful "nowcasts" of disease incidence ahead of healthcare-based estimates, with most successful case studies focusing on endemic and seasonal diseases such as influenza and dengue. Here, we apply similar computational methods to emerging outbreaks in geographic regions where no historical presence of the disease of interest has been observed. By combining the limited historical epidemiological data available with disease-related Internet search activity, we retrospectively estimate disease activity in five recent outbreaks weeks ahead of traditional surveillance methods. We find that the proposed computational methods frequently provide useful real-time incidence estimates that can help fill temporal data gaps resulting from surveillance reporting delays. However, the proposed methods are limited by issues of sample bias and skew in search query volumes, perhaps as a result of media coverage.
url https://doi.org/10.1371/journal.pcbi.1008117
work_keys_str_mv AT emilylaiken realtimeestimationofdiseaseactivityinemergingoutbreaksusinginternetsearchinformation
AT sarahfmcgough realtimeestimationofdiseaseactivityinemergingoutbreaksusinginternetsearchinformation
AT maimunasmajumder realtimeestimationofdiseaseactivityinemergingoutbreaksusinginternetsearchinformation
AT galwachtel realtimeestimationofdiseaseactivityinemergingoutbreaksusinginternetsearchinformation
AT andretnguyen realtimeestimationofdiseaseactivityinemergingoutbreaksusinginternetsearchinformation
AT cecileviboud realtimeestimationofdiseaseactivityinemergingoutbreaksusinginternetsearchinformation
AT mauriciosantillana realtimeestimationofdiseaseactivityinemergingoutbreaksusinginternetsearchinformation
_version_ 1721307866667679744