Stochastic modeling of biological sequence evolution

Bibliographic Details
Main Author: Xu, Keyuan
Other Authors: George C. Verghese and Peter C. Doerschuk.
Format: Others
Language: English
Published: Massachusetts Institute of Technology 2006
Subjects:
Online Access: http://hdl.handle.net/1721.1/32113
Description
Summary: Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references (leaves 81-86).

Markov models of sequence evolution are a fundamental building block for making inferences in biological research. This thesis reviews several major techniques developed to estimate parameters of Markov models of sequence evolution and presents a new approach for evaluating and comparing estimation techniques. Current methods for evaluating estimation techniques require sequence data from populations with well-known phylogenetic relationships. Such data are not always available, since phylogenetic relationships can never be known with certainty. We propose generating sequence data for the purpose of estimation technique evaluation by simulating sequence evolution in a controlled setting. Our elementary simulator uses a Markov model and a binary branching process, which dynamically builds a phylogenetic tree from an initial seed sequence. The sequences at the leaves of the tree can then be used as input to estimation techniques. We demonstrate our evaluation approach on Arvestad and Bruno's estimation method, and show how our approach can reveal performance variations empirically. The results of our simulation can be used as a guide towards improving estimation techniques.

by Keyuan Xu. M.Eng.
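The simulation idea in the summary (a Markov substitution model combined with a binary branching process grown from a seed sequence) can be illustrated with a minimal Python sketch. This is not the thesis's simulator: the names `mutate`, `simulate_tree`, and `leaves`, the per-branch substitution probability `p_sub`, the uniform choice among the other three bases, and the fixed branching depth are all assumptions made for illustration. The thesis describes a tree built dynamically, which this sketch simplifies to a fixed number of splits.

```python
import random

ALPHABET = "ACGT"  # assumed nucleotide alphabet


def mutate(seq, p_sub, rng):
    """Apply one Markov step independently at each site.

    Assumed model: with probability p_sub a base is replaced by one of
    the three other bases, chosen uniformly; otherwise it is unchanged.
    """
    out = []
    for base in seq:
        if rng.random() < p_sub:
            out.append(rng.choice([b for b in ALPHABET if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate_tree(seed_seq, depth, p_sub, rng):
    """Grow a binary tree: every node splits into two mutated children
    until the given depth is reached (a fixed-depth simplification)."""
    node = {"seq": seed_seq, "children": []}
    if depth > 0:
        for _ in range(2):
            child_seq = mutate(seed_seq, p_sub, rng)
            node["children"].append(
                simulate_tree(child_seq, depth - 1, p_sub, rng)
            )
    return node


def leaves(node):
    """Collect the sequences at the leaves, left to right."""
    if not node["children"]:
        return [node["seq"]]
    result = []
    for child in node["children"]:
        result.extend(leaves(child))
    return result


rng = random.Random(0)  # fixed seed so runs are repeatable
tree = simulate_tree("ACGTACGTACGT", depth=3, p_sub=0.05, rng=rng)
leaf_seqs = leaves(tree)
```

Because the tree and the substitution probability are known by construction, the leaf sequences in `leaf_seqs` could be passed to an estimation method and its output compared against the parameters used to generate them, which is the kind of evaluation the summary describes.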