Wikipedia usage estimates prevalence of influenza-like illness in the United States in near real-time.
Circulating levels of both seasonal and pandemic influenza require constant surveillance to ensure the health and safety of the population. While up-to-date information is critical, traditional surveillance systems can have data availability lags of up to two weeks. We introduce a novel method of es...
Main Authors: | David J McIver, John S Brownstein |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2014-04-01 |
Series: | PLoS Computational Biology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24743682/pdf/?tool=EBI |
id | doaj-efd53259183141b1bb9e69e63aa62623 |
---|---|
record_format | Article |
spelling | doaj-efd53259183141b1bb9e69e63aa62623 2021-04-21T15:36:04Z eng Public Library of Science (PLoS) PLoS Computational Biology 1553-734X 1553-7358 2014-04-01 10 4 e1003581 10.1371/journal.pcbi.1003581 Wikipedia usage estimates prevalence of influenza-like illness in the United States in near real-time. David J McIver John S Brownstein Circulating levels of both seasonal and pandemic influenza require constant surveillance to ensure the health and safety of the population. While up-to-date information is critical, traditional surveillance systems can have data availability lags of up to two weeks. We introduce a novel method of estimating, in near-real time, the level of influenza-like illness (ILI) in the United States (US) by monitoring the rate of particular Wikipedia article views on a daily basis. We calculated the number of times certain influenza- or health-related Wikipedia articles were accessed each day between December 2007 and August 2013 and compared these data to official ILI activity levels provided by the Centers for Disease Control and Prevention (CDC). We developed a Poisson model that accurately estimates the level of ILI activity in the American population, up to two weeks ahead of the CDC, with an absolute average difference between the two estimates of just 0.27% over 294 weeks of data. Wikipedia-derived ILI models performed well through both abnormally high media coverage events (such as during the 2009 H1N1 pandemic) as well as unusually severe influenza seasons (such as the 2012-2013 influenza season). Wikipedia usage accurately estimated the week of peak ILI activity 17% more often than Google Flu Trends data and was often more accurate in its measure of ILI intensity. With further study, this method could potentially be implemented for continuous monitoring of ILI activity in the US and to provide support for traditional influenza surveillance tools. https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24743682/pdf/?tool=EBI |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | David J McIver John S Brownstein |
spellingShingle | David J McIver John S Brownstein Wikipedia usage estimates prevalence of influenza-like illness in the United States in near real-time. PLoS Computational Biology |
author_facet | David J McIver John S Brownstein |
author_sort | David J McIver |
title | Wikipedia usage estimates prevalence of influenza-like illness in the United States in near real-time. |
title_short | Wikipedia usage estimates prevalence of influenza-like illness in the United States in near real-time. |
title_full | Wikipedia usage estimates prevalence of influenza-like illness in the United States in near real-time. |
title_fullStr | Wikipedia usage estimates prevalence of influenza-like illness in the United States in near real-time. |
title_full_unstemmed | Wikipedia usage estimates prevalence of influenza-like illness in the United States in near real-time. |
title_sort | wikipedia usage estimates prevalence of influenza-like illness in the united states in near real-time. |
publisher | Public Library of Science (PLoS) |
series | PLoS Computational Biology |
issn | 1553-734X 1553-7358 |
publishDate | 2014-04-01 |
description | Circulating levels of both seasonal and pandemic influenza require constant surveillance to ensure the health and safety of the population. While up-to-date information is critical, traditional surveillance systems can have data availability lags of up to two weeks. We introduce a novel method of estimating, in near-real time, the level of influenza-like illness (ILI) in the United States (US) by monitoring the rate of particular Wikipedia article views on a daily basis. We calculated the number of times certain influenza- or health-related Wikipedia articles were accessed each day between December 2007 and August 2013 and compared these data to official ILI activity levels provided by the Centers for Disease Control and Prevention (CDC). We developed a Poisson model that accurately estimates the level of ILI activity in the American population, up to two weeks ahead of the CDC, with an absolute average difference between the two estimates of just 0.27% over 294 weeks of data. Wikipedia-derived ILI models performed well through both abnormally high media coverage events (such as during the 2009 H1N1 pandemic) as well as unusually severe influenza seasons (such as the 2012-2013 influenza season). Wikipedia usage accurately estimated the week of peak ILI activity 17% more often than Google Flu Trends data and was often more accurate in its measure of ILI intensity. With further study, this method could potentially be implemented for continuous monitoring of ILI activity in the US and to provide support for traditional influenza surveillance tools. |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24743682/pdf/?tool=EBI |
work_keys_str_mv | AT davidjmciver wikipediausageestimatesprevalenceofinfluenzalikeillnessintheunitedstatesinnearrealtime AT johnsbrownstein wikipediausageestimatesprevalenceofinfluenzalikeillnessintheunitedstatesinnearrealtime |
_version_ | 1714667337090596864 |
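
The abstract describes a Poisson model that maps Wikipedia article views to ILI activity and is scored by the average absolute difference from CDC figures. The general idea might be sketched as below. This is not the authors' model: it assumes a single predictor, fits by Newton's method, and uses made-up numbers. The names `fit_poisson`, `predict`, `mean_abs_diff`, `views`, and `cases` are hypothetical and do not come from the record.

```python
import math


def fit_poisson(x, y, iterations=25):
    """Fit y ~ Poisson(exp(b0 + b1 * x)) by Newton's method.

    Assumes x contains at least two distinct values and y has a positive sum.
    """
    b0 = math.log(sum(y) / len(y))  # start from the overall mean rate
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Entries of the 2x2 Fisher information matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix inverse) times gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def predict(b0, b1, x):
    """Expected count for each predictor value under the fitted model."""
    return [math.exp(b0 + b1 * xi) for xi in x]


def mean_abs_diff(estimates, observed):
    """Average absolute difference between estimates and observations."""
    return sum(abs(e - o) for e, o in zip(estimates, observed)) / len(observed)


# Made-up illustration data, NOT from the paper:
# weekly article views (in thousands) and weekly ILI case counts.
views = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
cases = [4, 5, 8, 12, 18, 27]

if __name__ == "__main__":
    b0, b1 = fit_poisson(views, cases)
    estimates = predict(b0, b1, views)
    print("intercept:", b0, "slope:", b1)
    print("mean absolute difference:", mean_abs_diff(estimates, cases))
```

A real analysis would more likely use several article-view series as predictors and an established statistics library. The hand-written Newton update is used here only to keep the sketch free of dependencies.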