Summary: | Accurate prediction of an RNA secondary structure from its sequence will improve experimental design and interpretation for the growing number of scientists who study RNA. While the computer programs that make these predictions have improved, further improvements are necessary, particularly for larger RNAs. The first major section of this dissertation is concerned with improving the prediction accuracy of RNA secondary structures by generating new energetic parameters and evaluating a new RNA folding model. Statistical potentials for hairpin and internal loops produce significantly higher prediction accuracy when compared with nine other folding programs. While more improvements can be made to the energetic parameters used by secondary structure folding programs, I believe that a new approach is also necessary. I describe an RNA folding model that is predicated on a large body of computational and experimental work. This model includes energetics, contact distance, competition, and a folding pathway. Each component of this folding model is evaluated and its validity substantiated.
The statistical potentials were created with comparative analysis, which requires highly accurate multiple RNA sequence alignments. The second major section of this dissertation focuses on my template-based sequence aligner, CRWAlign. Multiple sequence aligners generally run into problems when pairwise sequence identity drops too low. By using multiple dimensions of data to establish a profile for each position in a template alignment, CRWAlign is able to align new sequences with high accuracy, even for pairs of sequences with low identity.
|