Estimating effective population size from genetic data : the past, present, and the future
Main Author: | |
---|---|
Other Authors: | |
Published: | Imperial College London, 2017 |
Subjects: | |
Online Access: | https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.721593 |
Summary: | Effective population size (Ne) is an important statistic in conservation science and in evolutionary genetics more broadly. Ne is often used to quantify the rate of evolutionary events such as the loss of genetic diversity. Estimating and interpreting this quantity can, however, be challenging. Chapter 2 focuses on the change in allele frequency between two or more time points due to genetic drift. A new likelihood-based estimator, N̂_B, is proposed for contemporary Ne estimation; it adopts a hidden Markov algorithm and continuous approximations. N̂_B is found to be several-fold faster than existing methods without sacrificing accuracy. It also raises the upper bound on Ne to several million, which is currently limited to about 50,000 by computational constraints. Chapter 3 extends N̂_B to handle multiallelic loci by using Dirichlet-multinomial distributions. An R package is also provided and available for download. Chapter 4 explores the signatures of linkage disequilibrium (LD) between a pair of loci induced by genetic drift, as a function of recombination rate and historical population sizes. E[r²] can be expressed as a weighted sum of the probabilities of coalescence at different time points, which contain information about Ne. This relationship is verified by computer simulation and then applied to historical Ne estimation, as illustrated with an example Anopheles coluzzii population. Chapter 5 suggests a new likelihood-based routine, Constrained ML, to estimate haplotype frequencies and r² from genotypes under Hardy-Weinberg equilibrium. It is shown to perform identically to the existing EM algorithm under normal conditions while being far less sensitive to initial conditions. A new “unbiased” sample size correction is also proposed for estimating r². To summarise, this work pushes Ne estimation to its current boundary and, more importantly, provides suitable tools for analysing ever-growing datasets. |
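For readers unfamiliar with the "existing EM algorithm" that the summary compares against, the following is a minimal sketch of the textbook EM approach to estimating two-locus haplotype frequencies, and hence r², from unphased diploid genotypes under Hardy-Weinberg equilibrium. It is not the thesis's Constrained ML routine and does not include the proposed sample size correction. The function names (`em_haplotype_freqs`, `r_squared`), the 3x3 genotype-count layout, and the uniform starting point are all assumptions made for illustration.

```python
import numpy as np

def em_haplotype_freqs(geno_counts, tol=1e-10, max_iter=1000):
    """Estimate haplotype frequencies (p11, p10, p01, p00) for two biallelic loci.

    geno_counts[gA][gB] is the number of individuals carrying gA copies of
    allele 1 at locus A and gB copies at locus B (assumed 0/1/2 coding).
    """
    n = np.asarray(geno_counts, dtype=float)
    total_haps = 2.0 * n.sum()
    # Haplotype counts that are unambiguous given the genotype.
    base = np.array([
        2 * n[2, 2] + n[2, 1] + n[1, 2],  # haplotype 11
        2 * n[2, 0] + n[2, 1] + n[1, 0],  # haplotype 10
        2 * n[0, 2] + n[1, 2] + n[0, 1],  # haplotype 01
        2 * n[0, 0] + n[1, 0] + n[0, 1],  # haplotype 00
    ])
    p = np.full(4, 0.25)  # assumed starting point; EM can be sensitive to this
    for _ in range(max_iter):
        # E-step: a double heterozygote is either 11/00 (cis) or 10/01 (trans).
        cis, trans = p[0] * p[3], p[1] * p[2]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: add the expected haplotype counts from double heterozygotes.
        new_p = (base + n[1, 1] * np.array([w, 1 - w, 1 - w, w])) / total_haps
        converged = np.max(np.abs(new_p - p)) < tol
        p = new_p
        if converged:
            break
    return p

def r_squared(p):
    """Squared correlation between loci from haplotype frequencies."""
    p_a = p[0] + p[1]          # frequency of allele 1 at locus A
    p_b = p[0] + p[2]          # frequency of allele 1 at locus B
    d = p[0] - p_a * p_b       # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else float("nan")
```

Only the double-heterozygote class is ambiguous about phase, so it is the only class that needs an E-step. The dependence on the starting vector `p` is the sensitivity to initial conditions that the summary refers to.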