Sparse RNA folding revisited
Background: RNA secondary structure prediction by energy minimization is the central computational tool for the analysis of structural non-coding RNAs and their interactions. Sparsification has been successfully applied to improve the time efficiency of various structure prediction algorithms while...
Main Authors: | Will, Sebastian; Jabbari, Hosna |
---|---|
Other Authors: | Universität Leipzig, Interdisziplinäres Zentrum für Bioinformatik |
Format: | Article |
Language: | English |
Published: | Universitätsbibliothek Leipzig, 2016 |
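Subjects: | Raumeinsparung (space saving); Pseudoknoten-RNA (pseudoknot RNA); RNA-Sekundärstruktur (RNA secondary structure); Space efficient sparsification; Pseudoknot-free RNA folding; RNA secondary structure prediction |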
Online Access: | http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-204163 http://www.qucosa.de/fileadmin/data/qucosa/documents/20416/OAP-2016-069_Will_art_10.1186_s13015-016-0071-y.pdf |
id |
ndltd-DRESDEN-oai-qucosa.de-bsz-15-qucosa-204163 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-DRESDEN-oai-qucosa.de-bsz-15-qucosa-204163; 2016-06-10T03:27:36Z; Sparse RNA folding revisited; Will, Sebastian; Jabbari, Hosna; Universitätsbibliothek Leipzig; Universität Leipzig, Interdisziplinäres Zentrum für Bioinformatik; BioMed Central, 2016-06-09; doc-type:article; application/pdf; urn:nbn:de:bsz:15-qucosa-204163; issn:1748-7188; Algorithms Mol Biol (2016) 11:7; DOI 10.1186/s13015-016-0071-y; eng |
collection |
NDLTD |
language |
English |
format |
Article |
sources |
NDLTD |
topic |
Raumeinsparung (space saving); Pseudoknoten-RNA (pseudoknot RNA); RNA-Sekundärstruktur (RNA secondary structure); Space efficient sparsification; Pseudoknot-free RNA folding; RNA secondary structure prediction; ddc:004; ddc:570 |
description |
Background: RNA secondary structure prediction by energy minimization is the central computational tool for the analysis of structural non-coding RNAs and their interactions. Sparsification has been successfully applied to improve the time efficiency of various structure prediction algorithms while guaranteeing the same result; however, for many such folding problems, space efficiency is of even greater concern, particularly for long RNA sequences. So far, space-efficient sparsified RNA folding with fold reconstruction has been solved only for simple base-pair-based pseudo-energy models. Results: Here, we revisit the problem of space-efficient free energy minimization. Whereas the space-efficient minimization of the free energy has been sketched before, the reconstruction of the optimum structure has not even been discussed. We show that this reconstruction is not possible by a trivial extension of the method for simple energy models. Then, we present the time- and space-efficient sparsified free energy minimization algorithm SparseMFEFold, which guarantees MFE structure prediction. In particular, this novel algorithm provides efficient fold reconstruction based on dynamically garbage-collected trace arrows. The complexity of our algorithm depends on two parameters, the number of candidates Z and the number of trace arrows T; both are bounded by n², but are typically much smaller. The time complexity of RNA folding is reduced from O(n³) to O(n² + nZ); the space complexity, from O(n²) to O(n + T + Z). Our empirical results show more than 80% space savings over RNAfold [Vienna RNA package] on the long RNAs from the RNA STRAND database (≥2500 bases). Conclusions: The presented technique is intentionally generalizable to complex prediction algorithms; due to their high space demands, algorithms like pseudoknot prediction and RNA–RNA-interaction prediction are expected to profit even more than "standard" MFE folding. SparseMFEFold is free software, available at http://www.bioinf.uni-leipzig.de/~will/Software/SparseMFEFold. |
author2 |
Universität Leipzig, Interdisziplinäres Zentrum für Bioinformatik |
author_facet |
Universität Leipzig, Interdisziplinäres Zentrum für Bioinformatik; Will, Sebastian; Jabbari, Hosna |
author |
Will, Sebastian; Jabbari, Hosna |
author_sort |
Will, Sebastian |
title |
Sparse RNA folding revisited |
publisher |
Universitätsbibliothek Leipzig |
publishDate |
2016 |
url |
http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-204163 http://www.qucosa.de/fileadmin/data/qucosa/documents/20416/OAP-2016-069_Will_art_10.1186_s13015-016-0071-y.pdf |
_version_ |
1718298798142259200 |
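
The abstract describes sparsification in terms of a candidate list of size Z and O(n) working space. As a rough illustration of that idea only, the following Python sketch applies candidate-based sparsification to the simple base-pair maximization model (the "simple base-pair-based pseudo-energy model" the abstract contrasts itself with). It is not SparseMFEFold: it uses no free energy model, no trace arrows, and does no fold reconstruction. The names `PAIRS`, `sparse_max_pairs`, and `min_loop` are hypothetical and chosen for this example.

```python
# Illustrative sketch only: candidate-based sparsification for base-pair
# maximization (not the free energy model or trace arrows of SparseMFEFold).

# Assumed pairing rules: Watson-Crick pairs plus G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def sparse_max_pairs(seq, min_loop=3):
    """Return (max number of base pairs, number of candidates Z).

    W(i, j) is the best score for seq[i..j]. Only two rows of W are kept,
    so the working space is O(n + Z) instead of O(n^2).
    """
    n = len(seq)
    # candidates[j] lists (k, score) where pairing k with j is strictly
    # better than every other decomposition of seq[k..j].
    candidates = [[] for _ in range(n)]
    # Rows are shifted by one: row[j + 1] = W(i, j), so row[i] = W(i, i-1) = 0.
    prev = [0] * (n + 1)  # row i + 1
    for i in range(n - 1, -1, -1):
        cur = [0] * (n + 1)  # row i
        for j in range(i, n):
            # Case 1: position j is unpaired, giving W(i, j-1).
            best = cur[j]
            # Case 2: j pairs with some k > i; only candidates are checked.
            for k, closed_score in candidates[j]:
                best = max(best, cur[k] + closed_score)  # W(i, k-1) + V(k, j)
            # Case 3: i pairs with j, giving 1 + W(i+1, j-1).
            if j - i > min_loop and (seq[i], seq[j]) in PAIRS:
                closed = 1 + prev[j]
                if closed > best:
                    # Strictly better than all other cases: new candidate.
                    candidates[j].append((i, closed))
                    best = closed
            cur[j + 1] = best
        prev = cur
    return prev[n], sum(len(c) for c in candidates)


if __name__ == "__main__":
    score, z = sparse_max_pairs("GGGAAACCC")
    print(score, z)
```

For the toy sequence `GGGAAACCC` with a minimum hairpin loop of 3, the optimal score is 3 base pairs. The inner loop over `candidates[j]` replaces the loop over all split points of the dense recurrence, which is where the O(n² + nZ) running time stated in the abstract comes from in this simplified setting.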