Inference of microbial recombination rates from metagenomic data.

Bibliographic Details
Main Authors: Philip L F Johnson, Montgomery Slatkin
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2009-10-01
Series: PLoS Genetics
Online Access: http://europepmc.org/articles/PMC2745702?pdf=render
Description
Summary: Metagenomic sequencing projects from environments dominated by a small number of species produce genome-wide population samples. We present a two-site composite likelihood estimator of the scaled recombination rate, rho = 2N_e c, that operates on metagenomic assemblies in which each sequenced fragment derives from a different individual. This new estimator properly accounts for sequencing error, as quantified by per-base quality scores, and missing data, as inferred from the placement of reads in a metagenomic assembly. We apply our estimator to data from a sludge metagenome project to demonstrate how this method will elucidate the rates of exchange of genetic material in natural microbial populations. Surprisingly, for a fixed amount of sequencing, this estimator has lower variance than similar methods that operate on more traditional population genetic samples of comparable size. In addition, we can infer variation in recombination rate across the genome because metagenomic projects sample genetic diversity genome-wide, not just at particular loci. The method itself makes no assumption specific to microbial populations, opening the door for application to any mixed population sample where the number of individuals sampled is much greater than the number of fragments sequenced.
ISSN: 1553-7390, 1553-7404