Summary: | Bacterial strains of the same species collected from different hosts frequently exhibit differences in gene content. In the ubiquitous plant pathogen Pseudomonas syringae, more than 30% of the genes encoded by each strain are not conserved among strains colonizing other host species. Although such variable genes are often implicated in host specificity, the role of this large fraction of the genome in host-specific adaptation is largely unexplored. Here, we sought to relate variation in gene content between strains infecting different host species to variation that persists between strains on the same host. We fully sequenced a set of P. syringae strains isolated from wild Arabidopsis thaliana populations in the Midwestern United States. We then compared the patterns of gene content variation observed among these A. thaliana-isolated strains to previously published P. syringae sequences from strains collected from a diversity of crop species. We find that strains collected from the same host, A. thaliana, differ in gene content by 21%, two-thirds of the level of gene content variation observed across strains collected from different hosts. Furthermore, the frequencies at which specific genes are present among strains collected from the same host and among strains collected from different hosts are highly correlated. This implies that most gene content variation is maintained irrespective of host association. At the same time, we identify specific genes whose presence is important for P. syringae's ability to flourish within A. thaliana. Specifically, the A. thaliana strains uniquely share a genomic island encoding toxins active against plants and surrounding microbes, suggesting a role for microbe-microbe interactions in dictating the pathogen's abundance within this host.
Overall, our results demonstrate that while variation in the presence of specific genes can affect the success of a pathogen within its host, the majority of gene content variation is not strongly associated with patterns of host use.
|