Computational statistics in molecular phylogenetics
Simulation remains a very important approach to testing the robustness and accuracy of phylogenetic inference methods. However, current simulation programs are limited, especially concerning realistic models for simulating insertions and deletions (indels). In this thesis I implement a new, portable...
Main Author: | Fletcher, W. A. J. |
---|---|
Published: | University College London (University of London), 2011 |
Subjects: | |
Online Access: | http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.594304 |
id |
ndltd-bl.uk-oai-ethos.bl.uk-594304 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-bl.uk-oai-ethos.bl.uk-594304; 2015-12-03T03:28:43Z; Computational statistics in molecular phylogenetics; Fletcher, W. A. J.; 2011; 570; University College London (University of London); http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.594304; http://discovery.ucl.ac.uk/1306705/; Electronic Thesis or Dissertation |
collection |
NDLTD |
sources |
NDLTD |
topic |
570 |
description |
Simulation remains a very important approach to testing the robustness and accuracy of phylogenetic inference methods. However, current simulation programs are limited, especially concerning realistic models for simulating insertions and deletions (indels). In this thesis I implement a new, portable and flexible application, named INDELible, which can be used to generate nucleotide, amino acid and codon sequence data by simulating indels (under several models of indel length distribution) as well as substitutions (under a rich repertoire of substitution models). In particular, I introduce a simulation study that makes use of one of INDELible’s many unique features to simulate data with indels under codon models that allow the nonsynonymous/synonymous substitution rate ratio to vary among sites and branches. These data are used to quantify, for the first time, the precise effects of indels and alignment errors on the false-positive rate and power of the widely used branch-site test of positive selection. Several alignment programs are used and assessed in this context. Through the simulation experiment, I show that insertions and deletions do not cause the test to generate excessive false positives if the alignment is correct, but alignment errors can lead to unacceptably high false-positive rates. Previous selection studies that used inferior alignment programs are revisited to demonstrate the applicability of my results in real-world situations. Further work uses simulated data from INDELible to examine the effects of tree shape and branch length on the alignment accuracy of several alignment programs, and the impact of alignment errors on different methods of phylogeny reconstruction. In particular, analysis is performed to explore which programs avoid generating the kind of alignment errors that are most detrimental to the process of phylogeny reconstruction. |
author |
Fletcher, W. A. J. |
title |
Computational statistics in molecular phylogenetics |
publisher |
University College London (University of London) |
publishDate |
2011 |
url |
http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.594304 |