Summary: | Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 95 === Motivation: RNA molecules are key players in the biochemistry of the cell, with important roles in regulation, catalysis and structural support. Many functional RNAs have evolutionarily conserved secondary structures that allow them to fulfill their roles in a cell. Although current approaches can identify common structure motifs in a set of RNAs, they typically assume that the given sequences come from a single family, which is not necessarily true in practice.
Results: We develop a new method based on structure decomposition and Gibbs sampling to predict consensus structure motifs in unaligned RNA sequences. Unlike most current approaches, our method is applicable to a set of mixed sequences from different families, and is able to predict multiple motifs for multiple families. Furthermore, because our system separates motif finding from sequence folding, folding algorithms other than Mfold or RNAfold can be easily integrated with our motif-finding process. Extensive testing on 17 families from Rfam shows that our method is competitive with other current tools in single-family predictions. In multi-family predictions, experiments demonstrate that our approach outperforms recent alternative methods.
|