Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing

Abstract. Background: The garden pea, Pisum sativum, is among the best-investigated legume plants and of significant agro-commercial relevance. Pisum sativum has a large and complex genome and accordingly few comp...

Bibliographic Details
Main Authors: Bräutigam Andrea, Shrestha Roshan P, Franssen Susanne U, Bornberg-Bauer Erich, Weber Andreas PM
Format: Article
Language: English
Published: BMC 2011-05-01
Series: BMC Genomics
Online Access: http://www.biomedcentral.com/1471-2164/12/227
id doaj-4bf4ef23d9d24da188cd135a21ba77c3
record_format Article
spelling doaj-4bf4ef23d9d24da188cd135a21ba77c3 2020-11-24T20:42:13Z eng BMC BMC Genomics 1471-2164 2011-05-01 12 1 227 10.1186/1471-2164-12-227
Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing
Bräutigam Andrea
Shrestha Roshan P
Franssen Susanne U
Bornberg-Bauer Erich
Weber Andreas PM
Abstract
Background: The garden pea, Pisum sativum, is among the best-investigated legume plants and of significant agro-commercial relevance. Pisum sativum has a large and complex genome and accordingly few comprehensive genomic resources exist.
Results: We analyzed the pea transcriptome at the highest level of accuracy attainable with current technology. We used next generation sequencing with the Roche/454 platform and evaluated and compared a variety of approaches, including diverse tissue libraries, normalization, alternative sequencing technologies, saturation estimation and diverse assembly strategies. We generated libraries from flowers, leaves, cotyledons, epi- and hypocotyl, and etiolated and light-treated etiolated seedlings, comprising a total of 450 megabases. Libraries were assembled into 324,428 unigenes in a first-pass assembly. A second-pass assembly reduced the number to 81,449 unigenes but introduced a significant number of chimeras. Analyses of the assemblies identified the assembly step as a major opportunity for improvement. By recording frequencies of Arabidopsis orthologs hit by randomly drawn reads and fitting parameters of the saturation curve, we concluded that sequencing was exhaustive. For leaf libraries, we found that normalization allows partial recovery of expression strength in addition to the desired effect of increased coverage. Based on theoretical and biological considerations, we concluded that the sequence reads in the database tagged the vast majority of transcripts in the aerial tissues. A pathway representation analysis showed the merits of sampling multiple aerial tissues to increase the number of tagged genes. All results have been made available as a fully annotated database in FASTA format.
Conclusions: We conclude that the approach taken resulted in a high-quality dataset that serves well as a first comprehensive reference set for the model legume pea. We suggest that future deep sequencing transcriptome projects of species lacking a genomics backbone will need to concentrate mainly on resolving the issues of redundancy and paralogy during transcriptome assembly.
http://www.biomedcentral.com/1471-2164/12/227
collection DOAJ
language English
format Article
sources DOAJ
author Bräutigam Andrea
Shrestha Roshan P
Franssen Susanne U
Bornberg-Bauer Erich
Weber Andreas PM
spellingShingle Bräutigam Andrea
Shrestha Roshan P
Franssen Susanne U
Bornberg-Bauer Erich
Weber Andreas PM
Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing
BMC Genomics
author_facet Bräutigam Andrea
Shrestha Roshan P
Franssen Susanne U
Bornberg-Bauer Erich
Weber Andreas PM
author_sort Bräutigam Andrea
title Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing
title_short Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing
title_full Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing
title_fullStr Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing
title_full_unstemmed Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing
title_sort comprehensive transcriptome analysis of the highly complex pisum sativum genome using next generation sequencing
publisher BMC
series BMC Genomics
issn 1471-2164
publishDate 2011-05-01
description Abstract
Background: The garden pea, Pisum sativum, is among the best-investigated legume plants and of significant agro-commercial relevance. Pisum sativum has a large and complex genome and accordingly few comprehensive genomic resources exist.
Results: We analyzed the pea transcriptome at the highest level of accuracy attainable with current technology. We used next generation sequencing with the Roche/454 platform and evaluated and compared a variety of approaches, including diverse tissue libraries, normalization, alternative sequencing technologies, saturation estimation and diverse assembly strategies. We generated libraries from flowers, leaves, cotyledons, epi- and hypocotyl, and etiolated and light-treated etiolated seedlings, comprising a total of 450 megabases. Libraries were assembled into 324,428 unigenes in a first-pass assembly. A second-pass assembly reduced the number to 81,449 unigenes but introduced a significant number of chimeras. Analyses of the assemblies identified the assembly step as a major opportunity for improvement. By recording frequencies of Arabidopsis orthologs hit by randomly drawn reads and fitting parameters of the saturation curve, we concluded that sequencing was exhaustive. For leaf libraries, we found that normalization allows partial recovery of expression strength in addition to the desired effect of increased coverage. Based on theoretical and biological considerations, we concluded that the sequence reads in the database tagged the vast majority of transcripts in the aerial tissues. A pathway representation analysis showed the merits of sampling multiple aerial tissues to increase the number of tagged genes. All results have been made available as a fully annotated database in FASTA format.
Conclusions: We conclude that the approach taken resulted in a high-quality dataset that serves well as a first comprehensive reference set for the model legume pea. We suggest that future deep sequencing transcriptome projects of species lacking a genomics backbone will need to concentrate mainly on resolving the issues of redundancy and paralogy during transcriptome assembly.
url http://www.biomedcentral.com/1471-2164/12/227
work_keys_str_mv AT brautigamandrea comprehensivetranscriptomeanalysisofthehighlycomplexitpisumsativumitgenomeusingnextgenerationsequencing
AT shrestharoshanp comprehensivetranscriptomeanalysisofthehighlycomplexitpisumsativumitgenomeusingnextgenerationsequencing
AT franssensusanneu comprehensivetranscriptomeanalysisofthehighlycomplexitpisumsativumitgenomeusingnextgenerationsequencing
AT bornbergbauererich comprehensivetranscriptomeanalysisofthehighlycomplexitpisumsativumitgenomeusingnextgenerationsequencing
AT weberandreaspm comprehensivetranscriptomeanalysisofthehighlycomplexitpisumsativumitgenomeusingnextgenerationsequencing
_version_ 1716822805604990976