Summary: | This thesis studies the evolution, influence, and proliferation of endogenous retroviruses (ERVs) within animal genomes. First, a simple mathematical model is constructed to address the question of whether retroviral endogenizations occur most often in male or female hosts. Applying the model to a diverse set of genomes suggests that there may be female risk factors for endogenization, or that selection may be acting on full-length ERVs. Second, a study of the divergence of orthologous full-length ERVs from human and chimpanzee is performed. It is found that highly transcribed members of the HERV-H family have been under directional selection in the last six million years. Third, the insertion and deletion activity of the largest ERV families in five primate species is studied. Using a phylogenetic model, it is demonstrated that ERVs are likely to be deleted early if they are to be deleted at all. Notably, it is also shown that HERV-H is an outlier family that is deleted unusually slowly. Fourth, the HERV-H loci in the human genome are studied on an individual basis. It is found that the long terminal repeats of HERV-H affect the magnitude and specificity of its transcription. Surprisingly, a region of the retroviral gag gene is positively associated with transcription, and it is argued that this association is a partial explanation for the preferential maintenance of HERV-H in a full-length form. In conclusion, it is argued that researchers should take seriously the notion that many ERVs have not drifted to fixation. It is also argued that taking account of solo-LTR formation is important for accurately assessing the historical activity of ERVs. Finally, it is hypothesised that the application of bioinformatics techniques like those developed in this thesis may be sufficient to identify exaptation events in species quite distant from the primates that are studied here.
|