Confronting data sparsity to identify potential sources of Zika virus spillover infection among primates

The recent Zika virus (ZIKV) epidemic in the Americas ranks among the largest outbreaks in modern times. Like other mosquito-borne flaviviruses, ZIKV circulates in sylvatic cycles among primates that can serve as reservoirs of spillover infection to humans. Identifying sylvatic reservoirs is critica...

Bibliographic Details
Main Authors: Barbara A. Han, Subhabrata Majumdar, Flavio P. Calmon, Benjamin S. Glicksberg, Raya Horesh, Abhishek Kumar, Adam Perer, Elisa B. von Marschall, Dennis Wei, Aleksandra Mojsilović, Kush R. Varshney
Format: Article
Language: English
Published: Elsevier 2019-06-01
Series: Epidemics
Online Access: http://www.sciencedirect.com/science/article/pii/S1755436518301531
id doaj-c36a9d2dd3a64a5c87d96495c6260860
record_format Article
spelling doaj-c36a9d2dd3a64a5c87d96495c6260860
2020-11-25T01:18:38Z
eng
Elsevier
Epidemics
1755-4365
2019-06-01
Volume 27, pages 59-65
Confronting data sparsity to identify potential sources of Zika virus spillover infection among primates
Barbara A. Han [0]
Subhabrata Majumdar [1]
Flavio P. Calmon [2]
Benjamin S. Glicksberg [3]
Raya Horesh [4]
Abhishek Kumar [5]
Adam Perer [6]
Elisa B. von Marschall [7]
Dennis Wei [8]
Aleksandra Mojsilović [9]
Kush R. Varshney [10]
[0] Cary Institute of Ecosystem Studies, Box AB Millbrook, NY 12545, USA; Corresponding author.
[1] University of Florida Informatics Institute, 432 Newell Drive, CISE Bldg E251, Gainesville, FL 32611, USA
[2] Harvard University, 29 Oxford St, Cambridge, MA 02138, USA
[3] Bakar Computational Health Sciences Institute, University of California, San Francisco, CA, 94158, USA
[4] IBM Research, 1101 Kitchawan Rd, Yorktown Heights, NY 10598, USA
[5] Cary Institute of Ecosystem Studies, Box AB Millbrook, NY 12545, USA; University of Florida Informatics Institute, 432 Newell Drive, CISE Bldg E251, Gainesville, FL 32611, USA; Harvard University, 29 Oxford St, Cambridge, MA 02138, USA; Bakar Computational Health Sciences Institute, University of California, San Francisco, CA, 94158, USA; IBM Research, 1101 Kitchawan Rd, Yorktown Heights, NY 10598, USA; Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213, USA; IBM Watson Media & Weather, 550 Assembly St, Columbia, SC 29201, USA
[6] Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213, USA
[7] IBM Watson Media & Weather, 550 Assembly St, Columbia, SC 29201, USA
[8] IBM Research, 1101 Kitchawan Rd, Yorktown Heights, NY 10598, USA
[9] IBM Research, 1101 Kitchawan Rd, Yorktown Heights, NY 10598, USA
[10] IBM Research, 1101 Kitchawan Rd, Yorktown Heights, NY 10598, USA
The recent Zika virus (ZIKV) epidemic in the Americas ranks among the largest outbreaks in modern times. Like other mosquito-borne flaviviruses, ZIKV circulates in sylvatic cycles among primates that can serve as reservoirs of spillover infection to humans. Identifying sylvatic reservoirs is critical to mitigating spillover risk, but relevant surveillance and biological data remain limited for this and most other zoonoses. We confronted this data sparsity by combining a machine learning method, Bayesian multi-label learning, with a multiple imputation method on primate traits. The resulting models distinguished flavivirus-positive primates with 82% accuracy and suggest that species posing the greatest spillover risk are also among the best adapted to human habitations. Given pervasive data sparsity describing animal hosts, and the virtual guarantee of data sparsity in scenarios involving novel or emerging zoonoses, we show that computational methods can be useful in extracting actionable inference from available data to support improved epidemiological response and prevention.
Keywords: Predictive analytics, Flavivirus, Arbovirus, Non-human primate, Machine learning, Bayesian multi-task learning, Imputation, Neotropical, Spillover, Spillback, Ecology, Surveillance
http://www.sciencedirect.com/science/article/pii/S1755436518301531
collection DOAJ
language English
format Article
sources DOAJ
author Barbara A. Han
Subhabrata Majumdar
Flavio P. Calmon
Benjamin S. Glicksberg
Raya Horesh
Abhishek Kumar
Adam Perer
Elisa B. von Marschall
Dennis Wei
Aleksandra Mojsilović
Kush R. Varshney
title Confronting data sparsity to identify potential sources of Zika virus spillover infection among primates
publisher Elsevier
series Epidemics
issn 1755-4365
publishDate 2019-06-01
description The recent Zika virus (ZIKV) epidemic in the Americas ranks among the largest outbreaks in modern times. Like other mosquito-borne flaviviruses, ZIKV circulates in sylvatic cycles among primates that can serve as reservoirs of spillover infection to humans. Identifying sylvatic reservoirs is critical to mitigating spillover risk, but relevant surveillance and biological data remain limited for this and most other zoonoses. We confronted this data sparsity by combining a machine learning method, Bayesian multi-label learning, with a multiple imputation method on primate traits. The resulting models distinguished flavivirus-positive primates with 82% accuracy and suggest that species posing the greatest spillover risk are also among the best adapted to human habitations. Given pervasive data sparsity describing animal hosts, and the virtual guarantee of data sparsity in scenarios involving novel or emerging zoonoses, we show that computational methods can be useful in extracting actionable inference from available data to support improved epidemiological response and prevention. Keywords: Predictive analytics, Flavivirus, Arbovirus, Non-human primate, Machine learning, Bayesian multi-task learning, Imputation, Neotropical, Spillover, Spillback, Ecology, Surveillance
url http://www.sciencedirect.com/science/article/pii/S1755436518301531