Improving the Annotation of Arabidopsis lyrata Using RNA-Seq Data.

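Gene model annotations are important community resources that ensure comparability and reproducibility of analyses and are typically the first step for functional annotation of genomic regions. Without up-to-date genome annotations, genome sequences cannot be used to maximum advantage. It is therefore essential to regularly update gene annotations by integrating the latest information to guarantee that reference annotations can remain a common basis for various types of analyses.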

Bibliographic Details
Main Authors: Vimal Rawat, Ahmed Abdelsamad, Björn Pietzenuk, Danelle K Seymour, Daniel Koenig, Detlef Weigel, Ales Pecinka, Korbinian Schneeberger
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2015-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC4575116?pdf=render
id doaj-5cdb79e91f48472a814db38bc8b444d2
record_format Article
spelling doaj-5cdb79e91f48472a814db38bc8b444d2; 2020-11-24T21:50:35Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2015-01-01; 10(9): e0137391; doi:10.1371/journal.pone.0137391; Improving the Annotation of Arabidopsis lyrata Using RNA-Seq Data.; Vimal Rawat, Ahmed Abdelsamad, Björn Pietzenuk, Danelle K Seymour, Daniel Koenig, Detlef Weigel, Ales Pecinka, Korbinian Schneeberger
collection DOAJ
language English
format Article
sources DOAJ
author Vimal Rawat
Ahmed Abdelsamad
Björn Pietzenuk
Danelle K Seymour
Daniel Koenig
Detlef Weigel
Ales Pecinka
Korbinian Schneeberger
title Improving the Annotation of Arabidopsis lyrata Using RNA-Seq Data.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2015-01-01
description Gene model annotations are important community resources that ensure comparability and reproducibility of analyses and are typically the first step for functional annotation of genomic regions. Without up-to-date genome annotations, genome sequences cannot be used to maximum advantage. It is therefore essential to regularly update gene annotations by integrating the latest information to guarantee that reference annotations can remain a common basis for various types of analyses. Here, we report an improvement of the Arabidopsis lyrata gene annotation using extensive RNA-seq data. This new annotation consists of 31,132 protein coding gene models in addition to 2,089 genes with high similarity to transposable elements. Overall, ~87% of the gene models are corroborated by evidence of expression and 2,235 of these models feature multiple transcripts. Our updated gene annotation corrects hundreds of incorrectly split or merged gene models in the original annotation, and as a result the identification of alternative splicing events and differential isoform usage are vastly improved.
url http://europepmc.org/articles/PMC4575116?pdf=render