Recognition of unknown conserved alternatively spliced exons.

The split structure of most mammalian protein-coding genes allows for the potential to produce multiple different mRNA and protein isoforms from a single gene locus through the process of alternative splicing (AS). We propose a computational approach called UNCOVER based on a pair hidden Markov mode...

Bibliographic Details
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2005-07-01
Series: PLoS Computational Biology
Online Access: http://dx.doi.org/10.1371/journal.pcbi.0010015
id doaj-f584bc0d58e14616965d32b43431c58b
record_format Article
spelling doaj-f584bc0d58e14616965d32b43431c58b 2020-11-25T00:45:54Z eng Public Library of Science (PLoS) PLoS Computational Biology 1553-734X 1553-7358 2005-07-01 1 2 e15 Recognition of unknown conserved alternatively spliced exons. http://dx.doi.org/10.1371/journal.pcbi.0010015
collection DOAJ
language English
format Article
sources DOAJ
title Recognition of unknown conserved alternatively spliced exons.
publisher Public Library of Science (PLoS)
series PLoS Computational Biology
issn 1553-734X
1553-7358
publishDate 2005-07-01
description The split structure of most mammalian protein-coding genes allows for the potential to produce multiple different mRNA and protein isoforms from a single gene locus through the process of alternative splicing (AS). We propose a computational approach called UNCOVER based on a pair hidden Markov model to discover conserved coding exonic sequences subject to AS that have so far gone undetected. Applying UNCOVER to orthologous introns of known human and mouse genes predicts skipped exons or retained introns present in both species, while discriminating them from conserved noncoding sequences. The accuracy of the model is evaluated on a curated set of genes with known conserved AS events. The prediction of skipped exons in the ~1% of the human genome represented by the ENCODE regions leads to more than 50 new exon candidates. Five novel predicted AS exons were validated by RT-PCR and sequencing analysis of 15 introns with strong UNCOVER predictions and lacking EST evidence. These results imply that a considerable number of conserved exonic sequences and associated isoforms are still completely missing from the current annotation of known genes. UNCOVER also identifies a small number of candidates for conserved intron retention.
url http://dx.doi.org/10.1371/journal.pcbi.0010015