Algorithms for predicting the secondary structure of pairs and combinatorial sets of nucleic acid strands

Secondary structure prediction of nucleic acid molecules is a very important problem in computational molecular biology. In this thesis we introduce two new algorithms for: (1) secondary structure prediction of pairs of nucleic acid molecules (PairFold), and (2) finding which sequences, formed from...

Bibliographic Details
Main Author: Andronescu, Mirela Ştefania
Language: English
Published: 2009
Online Access: http://hdl.handle.net/2429/14551
id ndltd-LACETR-oai-collectionscanada.gc.ca-BVAU.2429-14551
record_format oai_dc
spelling ndltd-LACETR-oai-collectionscanada.gc.ca-BVAU.2429-14551 2014-03-14T15:47:38Z
2009-11-02T20:24:53Z 2009-11-02T20:24:53Z 2003 2009-11-02T20:24:53Z 2003-11 Electronic Thesis or Dissertation http://hdl.handle.net/2429/14551 eng UBC Retrospective Theses Digitization Project [http://www.library.ubc.ca/archives/retro_theses/]
collection NDLTD
language English
sources NDLTD
description Secondary structure prediction of nucleic acid molecules is an important problem in computational molecular biology. In this thesis we introduce two new algorithms for: (1) secondary structure prediction of pairs of nucleic acid molecules (PairFold), and (2) finding which sequences, formed from a combinatorial set of nucleic acid strands, have the most stable secondary structures (CombFold). Our algorithms run in time polynomial in the sequence lengths and extend the free energy minimization algorithm [72] for secondary structure prediction without pseudoknots, using the nearest neighbour thermodynamic model. Predicting hybridization of pairs of molecules is motivated by applications such as ribozyme-mRNA target duplexes, primer binding prediction and DNA code design. Finding the most stable concatenations in combinatorial sets of strands is useful for SELEX experiments and for testing whether sets in DNA computing or tag libraries concatenate without secondary structure. Our results for PairFold predictions show over 80% accuracy for sequences of up to 100 nucleotides. Performance degrades as the sequences increase in length and as the number of non-canonical base pairs, pseudoknots and tertiary interactions, none of which are considered here, increases. The accuracy of CombFold is similar to that of the free energy minimization algorithm for single strands, since CombFold is a polynomial-time method for structure prediction over a combinatorial set of strands. We show that, although complex, CombFold can quickly predict large concatenations of sets drawn from the literature. In the future, these two algorithms can be combined to predict the most stable duplexes formed by two combinatorial sets.
author Andronescu, Mirela Ştefania
title Algorithms for predicting the secondary structure of pairs and combinatorial sets of nucleic acid strands
publishDate 2009
url http://hdl.handle.net/2429/14551
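
The description above refers to pseudoknot-free structure prediction by dynamic programming and to searching a combinatorial set of strands for the most stable concatenation. The sketch below is not PairFold or CombFold and does not use the nearest neighbour thermodynamic model. As a loudly simplified stand-in, it maximizes the number of nested base pairs (a Nussinov-style recurrence) and then scores every concatenation exhaustively. The function names, the RNA pairing rules and the minimum hairpin loop size of 3 are assumptions made for illustration only.

```python
from itertools import product

# Assumed pairing rules for this sketch: Watson-Crick pairs plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases inside a hairpin


def max_pairs(seq: str) -> int:
    """Maximum number of nested (pseudoknot-free) base pairs in seq."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j], inclusive.
    best = [[0] * n for _ in range(n)]
    for span in range(MIN_LOOP + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i + 1][j]  # case 1: base i stays unpaired
            # case 2: base i pairs with some base k, splitting the interval
            for k in range(i + MIN_LOOP + 1, j + 1):
                if (seq[i], seq[k]) in PAIRS:
                    inside = best[i + 1][k - 1]
                    outside = best[k + 1][j] if k + 1 <= j else 0
                    score = max(score, 1 + inside + outside)
            best[i][j] = score
    return best[0][n - 1]


def most_stable_concatenation(strand_sets):
    """Exhaustive baseline: score every concatenation of one strand per set.

    The number of candidates is the product of the set sizes, so this grows
    exponentially with the number of sets.
    """
    best_seq, best_score = None, -1
    for choice in product(*strand_sets):
        seq = "".join(choice)
        score = max_pairs(seq)
        if score > best_score:
            best_seq, best_score = seq, score
    return best_seq, best_score


if __name__ == "__main__":
    sets = [["GGG", "AAA"], ["AAA"], ["CCC", "AAA"]]
    print(most_stable_concatenation(sets))
```

With the three small sets in the usage example, four concatenations are scored and only GGGAAACCC forms any pairs under these rules. The exhaustive loop is the cost that a polynomial-time method over the whole combinatorial set is meant to avoid.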