Prediction of RNA Pseudoknotted Secondary Structure using Stochastic Context Free Grammars (SCFG)
Pseudoknots are a frequent RNA structure that plays essential roles in a variety of biocatalytic cell functions. One of the most challenging problems in bioinformatics is predicting this secondary structure from the base sequence that dictates it. Previously, a model adapted from...
Main Author: | Rafael García |
---|---|
Format: | Article |
Language: | English |
Published: | Centro Latinoamericano de Estudios en Informática, 2006-12-01 |
Series: | CLEI Electronic Journal |
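Subjects: | pseudoknots; Stochastic Context Free Grammars (SCFG); secondary structure prediction; RNA; dynamic programming |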
Online Access: | http://clei.org/cleiej-beta/index.php/cleiej/article/view/302 |
id |
doaj-927db551c9204f83957be80e68e42290 |
---|---|
record_format |
Article |
spelling |
doaj-927db551c9204f83957be80e68e42290; 2020-11-25T01:43:16Z; eng; Centro Latinoamericano de Estudios en Informática; CLEI Electronic Journal; ISSN 0717-5000; 2006-12-01; vol. 9, no. 2; DOI 10.19153/cleiej.9.2.6; Prediction of RNA Pseudoknotted Secondary Structure using Stochastic Context Free Grammars (SCFG); Rafael García (Politécnico Grancolombiano, Facultad de Ingeniería y Ciencias Básicas) |
collection |
DOAJ |
sources |
DOAJ |
author |
Rafael García |
title |
Prediction of RNA Pseudoknotted Secondary Structure using Stochastic Context Free Grammars (SCFG) |
issn |
0717-5000 |
description |
Pseudoknots are a frequent RNA structure that plays essential roles in a variety of biocatalytic cell functions. One of the most challenging problems in bioinformatics is predicting this secondary structure from the base sequence that dictates it. Previously, a model adapted from computational linguistics, Stochastic Context Free Grammars (SCFG), has been used to predict RNA secondary structure. However, to date the SCFG approach imposes a prohibitive complexity cost [O(n^4)] when applied to the prediction of pseudoknots, mainly because a context-sensitive grammar is formally required to analyze them. Other hybrid approaches (energy maximization) give O(n^3) complexity in the best case and also restrict the maximum sequence length that can be analyzed in practice.
Here we introduce a novel algorithm, based on pattern matching techniques, that uses a sequential approximation strategy to solve the original problem. This algorithm not only reduces the complexity to O(n^2 log n), but also increases the maximum sequence length that can be analyzed and makes it possible to analyze several pseudoknots simultaneously.
|
topic |
pseudoknots; Stochastic Context Free Grammars (SCFG); secondary structure prediction; RNA; dynamic programming |
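
The abstract states that pseudoknots formally require a context-sensitive grammar. One way to see why is that a pseudoknot contains crossing base pairs, (i, j) and (k, l) with i < k < j < l, whereas a context-free grammar can only generate properly nested pairs. The following is a minimal illustrative sketch in Python, not the article's algorithm: it assumes structures are written in extended dot-bracket notation (round and square brackets matched independently), and the function names `parse_brackets`, `crossing_pairs`, and `is_pseudoknotted` are hypothetical.

```python
from typing import List, Tuple

Pair = Tuple[int, int]


def parse_brackets(structure: str) -> List[Pair]:
    """Return base pairs (i, j) with i < j from extended dot-bracket notation.

    Assumption: '(' / ')' and '[' / ']' are matched independently of each
    other, and '.' marks an unpaired position.
    """
    closers = {")": "(", "]": "["}
    stacks = {"(": [], "[": []}
    pairs: List[Pair] = []
    for pos, ch in enumerate(structure):
        if ch in stacks:
            stacks[ch].append(pos)
        elif ch in closers:
            stack = stacks[closers[ch]]
            if not stack:
                raise ValueError(f"unmatched {ch!r} at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket")
    return sorted(pairs)


def crossing_pairs(pairs: List[Pair]) -> List[Tuple[Pair, Pair]]:
    """Return every two base pairs that cross, i.e. i < k < j < l.

    Brute-force comparison of all pairs; fine for illustration only.
    """
    ordered = sorted(pairs)
    crossings = []
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            if i < k < j < l:
                crossings.append(((i, j), (k, l)))
    return crossings


def is_pseudoknotted(structure: str) -> bool:
    """True if the structure contains at least one crossing of base pairs."""
    return bool(crossing_pairs(parse_brackets(structure)))


if __name__ == "__main__":
    print(is_pseudoknotted("((..))"))          # nested hairpin
    print(is_pseudoknotted("((..[[..))..]]"))  # two interleaved stems
```

A purely nested structure such as `((..))` has no crossings, while the interleaved stems in `((..[[..))..]]` do, which is the pattern a context-free grammar cannot derive directly.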