Simultaneous inference of phylogenetic and transmission trees in infectious disease outbreaks.
Whole-genome sequencing of pathogens from host samples becomes more and more routine during infectious disease outbreaks. These data provide information on possible transmission events which can be used for further epidemiologic analyses, such as identification of risk factors for infectivity and tr...
Main Authors: | Don Klinkenberg, Jantien A Backer, Xavier Didelot, Caroline Colijn, Jacco Wallinga |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2017-05-01 |
Series: | PLoS Computational Biology |
Online Access: | https://doi.org/10.1371/journal.pcbi.1005495 |
id |
doaj-15a8173015f6455fa587bee9083e1670 |
---|---|
record_format |
Article |
spelling |
doaj-15a8173015f6455fa587bee9083e1670; 2021-04-21T15:43:07Z; eng; Public Library of Science (PLoS); PLoS Computational Biology; ISSN 1553-734X, 1553-7358; 2017-05-01; 13(5): e1005495; 10.1371/journal.pcbi.1005495; Simultaneous inference of phylogenetic and transmission trees in infectious disease outbreaks.; Don Klinkenberg, Jantien A Backer, Xavier Didelot, Caroline Colijn, Jacco Wallinga; https://doi.org/10.1371/journal.pcbi.1005495 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Don Klinkenberg Jantien A Backer Xavier Didelot Caroline Colijn Jacco Wallinga |
title |
Simultaneous inference of phylogenetic and transmission trees in infectious disease outbreaks. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS Computational Biology |
issn |
1553-734X 1553-7358 |
publishDate |
2017-05-01 |
description |
Whole-genome sequencing of pathogens from host samples becomes more and more routine during infectious disease outbreaks. These data provide information on possible transmission events which can be used for further epidemiologic analyses, such as identification of risk factors for infectivity and transmission. However, the relationship between transmission events and sequence data is obscured by uncertainty arising from four largely unobserved processes: transmission, case observation, within-host pathogen dynamics and mutation. To properly resolve transmission events, these processes need to be taken into account. Recent years have seen much progress in theory and method development, but existing applications make simplifying assumptions that often break up the dependency between the four processes, or are tailored to specific datasets with matching model assumptions and code. To obtain a method with wider applicability, we have developed a novel approach to reconstruct transmission trees with sequence data. Our approach combines elementary models for transmission, case observation, within-host pathogen dynamics, and mutation, under the assumption that the outbreak is over and all cases have been observed. We use Bayesian inference with MCMC for which we have designed novel proposal steps to efficiently traverse the posterior distribution, taking account of all unobserved processes at once. This allows for efficient sampling of transmission trees from the posterior distribution, and robust estimation of consensus transmission trees. We implemented the proposed method in a new R package phybreak. The method performs well in tests of both new and published simulated data. We apply the model to five datasets on densely sampled infectious disease outbreaks, covering a wide range of epidemiological settings. Using only sampling times and sequences as data, our analyses confirmed the original results or improved on them: the more realistic infection times place more confidence in the inferred transmission trees. |
url |
https://doi.org/10.1371/journal.pcbi.1005495 |