Inference of past demography, dormancy and self-fertilization rates from whole genome sequence data.

Several methods based on the sequentially Markovian coalescent (SMC) have been developed that use genome sequence data to uncover population demographic history, which is of interest in its own right and is a key requirement to generate a null model for selection tests. While these methods can be applied to all kinds of species, the underlying assumptions are sexual reproduction in each generation and non-overlapping generations.

Bibliographic Details
Main Authors: Thibaut Paul Patrick Sellinger, Diala Abu Awad, Markus Moest, Aurélien Tellier
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2020-04-01
Series: PLoS Genetics
Online Access: https://doi.org/10.1371/journal.pgen.1008698
id doaj-a305d3c76c324ec590358091481e3593
record_format Article
spelling doaj-a305d3c76c324ec590358091481e3593
2021-04-21T14:37:26Z
eng
Public Library of Science (PLoS)
PLoS Genetics
1553-7390
1553-7404
2020-04-01
16
4
e1008698
10.1371/journal.pgen.1008698
Inference of past demography, dormancy and self-fertilization rates from whole genome sequence data.
Thibaut Paul Patrick Sellinger
Diala Abu Awad
Markus Moest
Aurélien Tellier
https://doi.org/10.1371/journal.pgen.1008698
collection DOAJ
language English
format Article
sources DOAJ
author Thibaut Paul Patrick Sellinger
Diala Abu Awad
Markus Moest
Aurélien Tellier
title Inference of past demography, dormancy and self-fertilization rates from whole genome sequence data.
publisher Public Library of Science (PLoS)
series PLoS Genetics
issn 1553-7390
1553-7404
publishDate 2020-04-01
description Several methods based on the sequentially Markovian coalescent (SMC) have been developed that use genome sequence data to uncover population demographic history, which is of interest in its own right and is a key requirement to generate a null model for selection tests. While these methods can be applied to all kinds of species, the underlying assumptions are sexual reproduction in each generation and non-overlapping generations. However, in many plants, invertebrates, fungi and other taxa, those assumptions are often violated due to different ecological and life-history traits, such as self-fertilization or long-term dormant structures (seed or egg-banking). We develop a novel SMC-based method to infer 1) the rates/parameters of dormancy and of self-fertilization, and 2) the populations' past demographic history. Using simulated data sets, we demonstrate the accuracy of our method for a wide range of demographic scenarios and for sequence lengths from 1 to 30 Mb using four sampled genomes. We then apply our method to a Swedish and a German population of Arabidopsis thaliana, demonstrating a selfing rate of ca. 0.87 and the absence of any detectable seed-bank. In contrast, we show that the water flea Daphnia pulex exhibits a long-lived egg-bank of 3 to 18 generations. In conclusion, we present a novel method to infer accurate demographies and life-history traits for species with selfing and/or seed/egg-banks. Finally, we provide recommendations for the use of SMC-based methods for non-model organisms, highlighting the importance of the per-site and the effective ratios of recombination over mutation.
url https://doi.org/10.1371/journal.pgen.1008698