Predictive modeling of plant messenger RNA polyadenylation sites

Abstract. Background: One of the essential processing events during pre-mRNA maturation is the post-transcriptional addition of a polyadenine [poly(A)] tail. The 3'-end poly(A) track protects mRNA from unregulated degradation, and indicates the integ...

Bibliographic Details
Main Authors: Loke Johnny C, Lin Yun, Jiang Ronghan, Wu Xiaohui, Shen Yingjia, Zheng Jianti, Ji Guoli, Davis Kimberly M, Reese Greg J, Li Qingshun
Format: Article
Language: English
Published: BMC 2007-02-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/8/43
id doaj-83434e948873405f926e9534be789873
record_format Article
spelling doaj-83434e948873405f926e9534be789873; 2020-11-25T00:14:39Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2007-02-01; 8; 1; 43; 10.1186/1471-2105-8-43; Predictive modeling of plant messenger RNA polyadenylation sites; http://www.biomedcentral.com/1471-2105/8/43
collection DOAJ
language English
format Article
sources DOAJ
author Loke Johnny C
Lin Yun
Jiang Ronghan
Wu Xiaohui
Shen Yingjia
Zheng Jianti
Ji Guoli
Davis Kimberly M
Reese Greg J
Li Qingshun
title Predictive modeling of plant messenger RNA polyadenylation sites
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2007-02-01
description Abstract
Background: One of the essential processing events during pre-mRNA maturation is the post-transcriptional addition of a polyadenine [poly(A)] tail. The 3'-end poly(A) track protects mRNA from unregulated degradation, and indicates the integrity of mRNA through recognition by mRNA export and translation machinery. The position of a poly(A) site is predetermined by signals in the pre-mRNA sequence that are recognized by a complex of polyadenylation factors. These signals are generally tri-part sequence patterns around the cleavage site that serves as the future poly(A) site. In plants, there is little sequence conservation among these signal elements, which makes it difficult to develop an accurate algorithm to predict the poly(A) site of a given gene. We attempted to solve this problem.
Results: Based on our current working model and the profile of nucleotide sequence distribution of the poly(A) signals and around poly(A) sites in Arabidopsis, we have devised a Generalized Hidden Markov Model based algorithm to predict potential poly(A) sites. The high specificity and sensitivity of the algorithm were demonstrated by testing several datasets, and at the best combinations, both reach 97%. The accuracy of the program, called poly(A) site sleuth or PASS, has been demonstrated by the prediction of many validated poly(A) sites. PASS also predicted the changes of poly(A) site efficiency in poly(A) signal mutants that were constructed and characterized by traditional genetic experiments. The efficacy of PASS was demonstrated by predicting poly(A) sites within long genomic sequences.
Conclusion: Based on the features of plant poly(A) signals, a computational model was built to effectively predict the poly(A) sites in Arabidopsis genes. The algorithm will be useful in gene annotation because a poly(A) site signifies the end of the transcript. This algorithm can also be used to predict alternative poly(A) sites in known genes, and will be useful in the design of transgenes for crop genetic engineering by predicting and eliminating undesirable poly(A) sites.
url http://www.biomedcentral.com/1471-2105/8/43