Summary: | This paper presents a novel population genetic model and a computationally and statistically tractable framework for analyzing within-host HIV diversity based on serial samples of HIV DNA sequences. This model considers within-host HIV evolution during the chronic phase of infection and assumes that the HIV population is homogeneous at the beginning, corresponding to the time of seroconversion, and evolves according to the Wright-Fisher reproduction model with recombination and variable mutation rate across nucleotide sites. In addition, the population size and generation time vary over time as piecewise constant functions of time. Under this model I approximate the genealogical and mutational processes for serial samples of DNA sequences by a continuous coalescent-recombination process and an inhomogeneous Poisson process, respectively. Based on these derivations, an efficient algorithm is described for generating polymorphisms in serial samples of DNA sequences under the model including various substitution models. Extensions of the algorithm are also described for other demographic scenarios that can be more suitable for analyzing the dynamics of genetic diversity of other pathogens in vitro and in vivo. For the case of the infinite-sites model, I derive analytical formulas for the expected number of polymorphic sites in sample of DNA sequences, and apply the developed simulation and analytical methods to explore the fit of the model to HIV genetic diversity based on serial samples of HIV DNA sequences from 9 HIV-infected individuals. The results particularly show that the estimates of the ratio of recombination rate over mutation rate can vary over time between very high and low values, which can be considered as a consequence of the impact of selection forces.
|