When are pathogen genome sequences informative of transmission events?

Recent years have seen the development of numerous methodologies for reconstructing transmission trees in infectious disease outbreaks from densely sampled whole genome sequence data. However, a fundamental and as yet poorly addressed limitation of such approaches is the requirement for genetic d...

Bibliographic Details
Main Authors: Finlay Campbell, Camilla Strang, Neil Ferguson, Anne Cori, Thibaut Jombart
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2018-02-01
Series: PLoS Pathogens
Online Access: https://doi.org/10.1371/journal.ppat.1006885
id doaj-7275d3c96e8e4c0b8be00463e9a03dd1
record_format Article
spelling doaj-7275d3c96e8e4c0b8be00463e9a03dd1 | 2021-04-21T17:54:32Z | eng | Public Library of Science (PLoS) | PLoS Pathogens | 1553-7366, 1553-7374 | 2018-02-01 | 14(2): e1006885 | 10.1371/journal.ppat.1006885 | When are pathogen genome sequences informative of transmission events? | Finlay Campbell, Camilla Strang, Neil Ferguson, Anne Cori, Thibaut Jombart | https://doi.org/10.1371/journal.ppat.1006885
collection DOAJ
language English
format Article
sources DOAJ
author Finlay Campbell
Camilla Strang
Neil Ferguson
Anne Cori
Thibaut Jombart
author_sort Finlay Campbell
title When are pathogen genome sequences informative of transmission events?
publisher Public Library of Science (PLoS)
series PLoS Pathogens
issn 1553-7366
1553-7374
publishDate 2018-02-01
description Recent years have seen the development of numerous methodologies for reconstructing transmission trees in infectious disease outbreaks from densely sampled whole genome sequence data. However, a fundamental and as yet poorly addressed limitation of such approaches is the requirement for genetic diversity to arise on epidemiological timescales. Specifically, the position of infected individuals in a transmission tree can only be resolved by genetic data if mutations have accumulated between the sampled pathogen genomes. To quantify and compare the useful genetic diversity expected from genetic data in different pathogen outbreaks, we introduce here the concept of 'transmission divergence', defined as the number of mutations separating whole genome sequences sampled from transmission pairs. Using parameter values obtained by literature review, we simulate outbreak scenarios alongside sequence evolution, using two models described in the literature, to characterise the transmission divergence of ten major outbreak-causing pathogens. We find that while mean values vary significantly between the pathogens considered, their transmission divergence is generally very low, with many outbreaks characterised by large numbers of genetically identical transmission pairs. We describe the impact of transmission divergence on our ability to reconstruct outbreaks using two outbreak reconstruction tools, the R packages outbreaker and phybreak, and demonstrate that, in agreement with previous observations, genetic sequence data of rapidly evolving pathogens such as RNA viruses can provide valuable information on individual transmission events. Conversely, sequence data of pathogens with lower mean transmission divergence, including Streptococcus pneumoniae, Shigella sonnei and Clostridium difficile, provide little to no information about individual transmission events. Our results highlight the informational limitations of genetic sequence data in certain outbreak scenarios, and demonstrate the need to expand the toolkit of outbreak reconstruction tools to integrate other types of epidemiological data.
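The abstract defines transmission divergence as the number of mutations separating whole genome sequences sampled from a transmission pair. The following Python sketch is one possible way to compute that quantity for a set of known pairs; it is an illustration only and not the authors' simulation code. It assumes aligned, equal-length sequences and approximates the mutation count by the number of differing sites (a Hamming distance), which undercounts if a site mutates more than once. The function names, case labels and toy sequences are all hypothetical.

```python
from typing import Dict, List, Tuple


def transmission_divergence(seq_a: str, seq_b: str) -> int:
    """Count the sites at which two aligned sequences differ.

    Assumption: the number of differing sites stands in for the
    number of mutations separating the two genomes.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def summarise_pairs(
    sequences: Dict[str, str],
    pairs: List[Tuple[str, str]],
) -> Dict[str, object]:
    """Summarise divergence over (infector, infectee) pairs."""
    if not pairs:
        raise ValueError("at least one transmission pair is required")
    divergences = [
        transmission_divergence(sequences[infector], sequences[infectee])
        for infector, infectee in pairs
    ]
    n = len(divergences)
    return {
        "divergences": divergences,
        "mean": sum(divergences) / n,
        # Pairs with zero divergence are genetically identical and so
        # cannot be distinguished using sequence data alone.
        "proportion_identical": sum(d == 0 for d in divergences) / n,
    }


# Hypothetical toy data, invented for illustration only.
toy_sequences = {
    "case1": "ACGTACGTAC",
    "case2": "ACGTACGTAC",
    "case3": "ACGTACGTAT",
    "case4": "ACGAACGTAT",
}
toy_pairs = [("case1", "case2"), ("case1", "case3"), ("case3", "case4")]

summary = summarise_pairs(toy_sequences, toy_pairs)
print(summary)
```

In this toy example one of the three pairs is genetically identical, which is the situation the abstract describes as uninformative for resolving who infected whom.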
url https://doi.org/10.1371/journal.ppat.1006885