Sparse RNA folding revisited

Background: RNA secondary structure prediction by energy minimization is the central computational tool for the analysis of structural non-coding RNAs and their interactions. Sparsification has been successfully applied to improve the time efficiency of various structure prediction algorithms while...

Bibliographic Details
Main Authors: Will, Sebastian; Jabbari, Hosna
Other Authors: Universität Leipzig, Interdisziplinäres Zentrum für Bioinformatik
Format: Article
Language: English
Published: Universitätsbibliothek Leipzig 2016
Subjects:
Online Access: http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-204163
http://www.qucosa.de/fileadmin/data/qucosa/documents/20416/OAP-2016-069_Will_art_10.1186_s13015-016-0071-y.pdf
id ndltd-DRESDEN-oai-qucosa.de-bsz-15-qucosa-204163
record_format oai_dc
spelling ndltd-DRESDEN-oai-qucosa.de-bsz-15-qucosa-204163 2016-06-10T03:27:36Z Sparse RNA folding revisited Will, Sebastian Jabbari, Hosna Raumeinsparung Pseudoknoten-RNA RNA-Sekundärstruktur Space efficient sparsification Pseudoknot-free RNA folding RNA secondary structure prediction ddc:004 ddc:570 Background: RNA secondary structure prediction by energy minimization is the central computational tool for the analysis of structural non-coding RNAs and their interactions. Sparsification has been successfully applied to improve the time efficiency of various structure prediction algorithms while guaranteeing the same result; however, for many such folding problems, space efficiency is of even greater concern, particularly for long RNA sequences. So far, space-efficient sparsified RNA folding with fold reconstruction was solved only for simple base-pair-based pseudo-energy models. Results: Here, we revisit the problem of space-efficient free energy minimization. Whereas the space-efficient minimization of the free energy has been sketched before, the reconstruction of the optimum structure has not even been discussed. We show that this reconstruction is not possible by a trivial extension of the method for simple energy models. Then, we present the time- and space-efficient sparsified free energy minimization algorithm SparseMFEFold, which guarantees MFE structure prediction. In particular, this novel algorithm provides efficient fold reconstruction based on dynamically garbage-collected trace arrows. The complexity of our algorithm depends on two parameters, the number of candidates Z and the number of trace arrows T; both are bounded by n^2, but are typically much smaller. The time complexity of RNA folding is reduced from O(n^3) to O(n^2 + nZ); the space complexity, from O(n^2) to O(n + T + Z). Our empirical results show more than 80 % space savings over RNAfold [Vienna RNA package] on the long RNAs from the RNA STRAND database (≥2500 bases).
Conclusions: The presented technique is intentionally generalizable to complex prediction algorithms; due to their high space demands, algorithms like pseudoknot prediction and RNA–RNA-interaction prediction are expected to profit even more than "standard" MFE folding. SparseMFEFold is free software, available at http://www.bioinf.uni-leipzig.de/~will/Software/SparseMFEFold. Universitätsbibliothek Leipzig Universität Leipzig, Interdisziplinäres Zentrum für Bioinformatik BioMed Central, 2016-06-09 doc-type:article application/pdf http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-204163 urn:nbn:de:bsz:15-qucosa-204163 issn:1748-7188 http://www.qucosa.de/fileadmin/data/qucosa/documents/20416/OAP-2016-069_Will_art_10.1186_s13015-016-0071-y.pdf Algorithms Mol Biol (2016) 11:7 DOI 10.1186/s13015-016-0071-y eng
collection NDLTD
language English
format Article
sources NDLTD
topic Raumeinsparung
Pseudoknoten-RNA
RNA-Sekundärstruktur
Space efficient sparsification
Pseudoknot-free RNA folding
RNA secondary structure prediction
ddc:004
ddc:570
spellingShingle Raumeinsparung
Pseudoknoten-RNA
RNA-Sekundärstruktur
Space efficient sparsification
Pseudoknot-free RNA folding
RNA secondary structure prediction
ddc:004
ddc:570
Will, Sebastian
Jabbari, Hosna
Sparse RNA folding revisited
description Background: RNA secondary structure prediction by energy minimization is the central computational tool for the analysis of structural non-coding RNAs and their interactions. Sparsification has been successfully applied to improve the time efficiency of various structure prediction algorithms while guaranteeing the same result; however, for many such folding problems, space efficiency is of even greater concern, particularly for long RNA sequences. So far, space-efficient sparsified RNA folding with fold reconstruction was solved only for simple base-pair-based pseudo-energy models. Results: Here, we revisit the problem of space-efficient free energy minimization. Whereas the space-efficient minimization of the free energy has been sketched before, the reconstruction of the optimum structure has not even been discussed. We show that this reconstruction is not possible by a trivial extension of the method for simple energy models. Then, we present the time- and space-efficient sparsified free energy minimization algorithm SparseMFEFold, which guarantees MFE structure prediction. In particular, this novel algorithm provides efficient fold reconstruction based on dynamically garbage-collected trace arrows. The complexity of our algorithm depends on two parameters, the number of candidates Z and the number of trace arrows T; both are bounded by n^2, but are typically much smaller. The time complexity of RNA folding is reduced from O(n^3) to O(n^2 + nZ); the space complexity, from O(n^2) to O(n + T + Z). Our empirical results show more than 80 % space savings over RNAfold [Vienna RNA package] on the long RNAs from the RNA STRAND database (≥2500 bases). Conclusions: The presented technique is intentionally generalizable to complex prediction algorithms; due to their high space demands, algorithms like pseudoknot prediction and RNA–RNA-interaction prediction are expected to profit even more than "standard" MFE folding.
SparseMFEFold is free software, available at http://www.bioinf.uni-leipzig.de/~will/Software/SparseMFEFold.
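The candidate idea behind the O(n^2 + nZ) time bound can be illustrated on the simple base-pair-based model that the abstract mentions as the previously solved case. The following is a minimal sketch under stated assumptions, not SparseMFEFold itself: it maximizes the number of base pairs instead of minimizing free energy, it does no fold reconstruction and keeps no trace arrows, and the function names, the pairing rule (AU, GC, GU) and the minimum loop length of 3 are all assumptions made for this example. The sparse variant keeps only two matrix rows plus the candidate lists, so its space use is O(n + Z).

```python
# Illustrative sketch only: base-pair maximization, not free energy minimization.
# All names and constants here are hypothetical and not taken from SparseMFEFold.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases enclosed by a pair


def max_pairs_dense(seq):
    """Reference version: O(n^3) time, O(n^2) space."""
    n = len(seq)
    # L[i][j] = maximum number of pairs in seq[i..j] (1-based, inclusive);
    # entries with j < i stay 0.
    L = [[0] * (n + 1) for _ in range(n + 2)]
    for i in range(n, 0, -1):
        for j in range(i, n + 1):
            best = L[i][j - 1]  # case: j is unpaired
            for k in range(i, j - MIN_LOOP):  # case: j pairs with k
                if (seq[k - 1], seq[j - 1]) in PAIRS:
                    best = max(best, L[i][k - 1] + L[k + 1][j - 1] + 1)
            L[i][j] = best
    return L[1][n]


def max_pairs_sparse(seq):
    """Sparsified version: O(n^2 + nZ) time, O(n + Z) space.

    Returns (maximum number of pairs, number of candidates Z).
    """
    n = len(seq)
    # cands[j] holds (k, closed) for every candidate [k, j], where closed is
    # the best score of seq[k..j] with k paired to j.
    cands = [[] for _ in range(n + 1)]
    prev = [0] * (n + 1)  # row i + 1 of the matrix
    for i in range(n, 0, -1):
        cur = [0] * (n + 1)  # row i of the matrix
        for j in range(i, n + 1):
            best = cur[j - 1]  # case: j is unpaired
            for k, closed in cands[j]:  # case: j pairs with a candidate k > i
                best = max(best, cur[k - 1] + closed)
            if j - i > MIN_LOOP and (seq[i - 1], seq[j - 1]) in PAIRS:
                closed = prev[j - 1] + 1  # case: i pairs with j
                if closed > best:
                    # [i, j] is a candidate: closing it beats every way of
                    # splitting seq[i..j], so later rows must consider it.
                    cands[j].append((i, closed))
                    best = closed
            cur[j] = best
        prev = cur
    return prev[n], sum(len(c) for c in cands)
```

Calling `max_pairs_sparse(seq)` on an RNA string returns the same optimum as `max_pairs_dense(seq)` together with the candidate count Z, which makes it easy to see how far Z stays below n^2 on a given sequence. Only pairs that strictly improve on every decomposition are stored; a non-candidate pair can always be replaced by a split that scores at least as well, which is why skipping it does not change the result.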
author2 Universität Leipzig, Interdisziplinäres Zentrum für Bioinformatik
author_facet Universität Leipzig, Interdisziplinäres Zentrum für Bioinformatik
Will, Sebastian
Jabbari, Hosna
author Will, Sebastian
Jabbari, Hosna
author_sort Will, Sebastian
title Sparse RNA folding revisited
title_short Sparse RNA folding revisited
title_full Sparse RNA folding revisited
title_fullStr Sparse RNA folding revisited
title_full_unstemmed Sparse RNA folding revisited
title_sort sparse rna folding revisited
publisher Universitätsbibliothek Leipzig
publishDate 2016
url http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-204163
http://www.qucosa.de/fileadmin/data/qucosa/documents/20416/OAP-2016-069_Will_art_10.1186_s13015-016-0071-y.pdf
work_keys_str_mv AT willsebastian sparsernafoldingrevisited
AT jabbarihosna sparsernafoldingrevisited
_version_ 1718298798142259200