De novo assembly of leaf transcriptome in the medicinal plant Andrographis paniculata

Bibliographic Details
Main Authors: Neeraja Cherukupalli, Mayur Divate, Suresh Reddy Mittapelli, Venkateswara Rao Khareedu, Dashavantha Reddy Vudem
Format: Article
Language: English
Published: Frontiers Media S.A. 2016-08-01
Series: Frontiers in Plant Science
Online Access: http://journal.frontiersin.org/Journal/10.3389/fpls.2016.01203/full
Description
Summary: Andrographis paniculata is an important medicinal plant containing various bioactive terpenoids and flavonoids. Despite its importance in herbal medicine, no ready-to-use transcript sequence information for this plant is available in public databases. This study deals mainly with the sequencing of RNA from A. paniculata leaf on the Illumina HiSeq 2000 platform, followed by de novo transcriptome assembly. A total of 189.22 million high-quality paired reads were generated, and 170,724 transcripts were predicted in the primary assembly. Secondary assembly generated a transcriptome of ~88 Mb with 83,800 clustered transcripts. Based on similarity searches against the plant non-redundant protein database, gene ontology, and eukaryotic orthologous groups, 49,363 transcripts were annotated, constituting up to 58.91% of the identified unigenes. Annotation of transcripts using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database revealed 5,606 transcripts plausibly involved in 140 pathways, including the biosynthesis of terpenoids and other secondary metabolites. Transcription factor analysis showed 6,767 unique transcripts belonging to 97 different transcription factor families. A total of 124 CYP450 transcripts belonging to seven divergent clans were identified. The transcriptome revealed 146 different transcripts coding for enzymes involved in terpenoid biosynthesis, of which 35 contained terpene synthase motifs. This study also revealed 32,341 simple sequence repeats (SSRs) in 23,168 transcripts. The assembled transcriptome sequences of A. paniculata generated in this study are made available, for the first time, in the TSA database, providing useful information for functional and comparative genomic analyses as well as for identifying key enzymes involved in the various pathways of secondary metabolism.
ISSN: 1664-462X
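
The proportions reported in the summary can be checked with a few lines of arithmetic. The Python sketch below is illustrative only and is not part of the published analysis. The counts are taken from the summary; the variable and function names are hypothetical, and it assumes that the 83,800 clustered transcripts are the denominator for the annotation percentage. The SSR figures computed here are derived values that the summary does not itself state.

```python
# Counts quoted in the summary (names are illustrative, not from the article).
CLUSTERED_TRANSCRIPTS = 83_800   # transcripts after secondary assembly
ANNOTATED_TRANSCRIPTS = 49_363   # transcripts with at least one annotation
SSR_COUNT = 32_341               # simple sequence repeats detected
SSR_TRANSCRIPTS = 23_168         # transcripts containing at least one SSR


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Assumption: the annotation percentage is taken over the clustered transcripts.
annotated_pct = percent(ANNOTATED_TRANSCRIPTS, CLUSTERED_TRANSCRIPTS)

# Derived values, not stated in the summary.
ssr_transcript_pct = percent(SSR_TRANSCRIPTS, CLUSTERED_TRANSCRIPTS)
ssrs_per_ssr_transcript = SSR_COUNT / SSR_TRANSCRIPTS

print(f"Annotated transcripts: {annotated_pct:.2f}%")
print(f"Transcripts with SSRs: {ssr_transcript_pct:.2f}%")
print(f"SSRs per SSR-containing transcript: {ssrs_per_ssr_transcript:.2f}")
```

Under that assumption the first figure reproduces the 58.91% given in the summary, which supports reading the clustered transcripts as the "identified unigenes".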