Wikipedia usage estimates prevalence of influenza-like illness in the United States in near real-time.

Circulating levels of both seasonal and pandemic influenza require constant surveillance to ensure the health and safety of the population. While up-to-date information is critical, traditional surveillance systems can have data availability lags of up to two weeks. We introduce a novel method of es...

Bibliographic Details
Main Authors: David J McIver, John S Brownstein
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2014-04-01
Series: PLoS Computational Biology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24743682/pdf/?tool=EBI
id doaj-efd53259183141b1bb9e69e63aa62623
record_format Article
spelling doaj-efd53259183141b1bb9e69e63aa62623
2021-04-21T15:36:04Z
eng
Public Library of Science (PLoS)
PLoS Computational Biology
1553-734X
1553-7358
2014-04-01
10
4
e1003581
10.1371/journal.pcbi.1003581
Wikipedia usage estimates prevalence of influenza-like illness in the United States in near real-time.
David J McIver
John S Brownstein
collection DOAJ
language English
format Article
sources DOAJ
author David J McIver
John S Brownstein
title Wikipedia usage estimates prevalence of influenza-like illness in the United States in near real-time.
publisher Public Library of Science (PLoS)
series PLoS Computational Biology
issn 1553-734X
1553-7358
publishDate 2014-04-01
description Circulating levels of both seasonal and pandemic influenza require constant surveillance to ensure the health and safety of the population. While up-to-date information is critical, traditional surveillance systems can have data availability lags of up to two weeks. We introduce a novel method of estimating, in near-real time, the level of influenza-like illness (ILI) in the United States (US) by monitoring the rate of particular Wikipedia article views on a daily basis. We calculated the number of times certain influenza- or health-related Wikipedia articles were accessed each day between December 2007 and August 2013 and compared these data to official ILI activity levels provided by the Centers for Disease Control and Prevention (CDC). We developed a Poisson model that accurately estimates the level of ILI activity in the American population, up to two weeks ahead of the CDC, with an absolute average difference between the two estimates of just 0.27% over 294 weeks of data. Wikipedia-derived ILI models performed well through both abnormally high media coverage events (such as during the 2009 H1N1 pandemic) as well as unusually severe influenza seasons (such as the 2012-2013 influenza season). Wikipedia usage accurately estimated the week of peak ILI activity 17% more often than Google Flu Trends data and was often more accurate in its measure of ILI intensity. With further study, this method could potentially be implemented for continuous monitoring of ILI activity in the US and to provide support for traditional influenza surveillance tools.
url https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24743682/pdf/?tool=EBI
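
The description in this record mentions a Poisson model relating Wikipedia article views to CDC ILI activity, evaluated by the average absolute difference between the two estimates. The record does not give the model's covariates or fitting procedure, so the following is only a minimal sketch of a Poisson regression with a log link, not the authors' implementation. The function names (`fit_poisson`, `mean_abs_diff`), the use of a single predictor, and all data values are assumptions made for illustration; the data are synthetic and noise-free so that the fit can be checked against known coefficients.

```python
import math


def fit_poisson(x, y, iters=50):
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson (one predictor)."""
    # Start from the intercept-only solution so the first step is well scaled.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the Poisson log-likelihood with respect to (b0, b1).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix (2x2), inverted by hand.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < 1e-12:
            break
    return b0, b1


def mean_abs_diff(estimates, observed):
    """Average absolute difference between two equally long series."""
    return sum(abs(e - o) for e, o in zip(estimates, observed)) / len(observed)


# Hypothetical weekly predictor: article views in units of 10,000.
views = [1.0, 1.5, 2.0, 3.0, 4.0, 5.0]
# Synthetic, noise-free response generated from known coefficients (0.5, 0.4).
ili = [math.exp(0.5 + 0.4 * v) for v in views]

b0, b1 = fit_poisson(views, ili)
fitted = [math.exp(b0 + b1 * v) for v in views]
print(round(b0, 3), round(b1, 3), mean_abs_diff(fitted, ili))
```

With real data, `views` would be replaced by observed article view counts and `ili` by the official weekly ILI figures, and a multi-predictor model would normally be fit with a statistics library. The hand-written solver is used here only to keep the example dependency-free.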