Pervasive transcription of the human genome produces thousands of previously unidentified long intergenic noncoding RNAs.

Known protein coding gene exons compose less than 3% of the human genome. The remaining 97% is largely uncharted territory, with only a small fraction characterized. The recent observation of transcription in this intergenic territory has stimulated debate about the extent of intergenic transcriptio...


Bibliographic Details
Main Authors: Matthew J Hangauer, Ian W Vaughn, Michael T McManus
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2013-06-01
Series: PLoS Genetics
Online Access: http://europepmc.org/articles/PMC3688513?pdf=render
id doaj-87611a43e73e48e6aa73982e0b6ae344
record_format Article
spelling doaj-87611a43e73e48e6aa73982e0b6ae344; 2020-11-24T22:20:28Z; eng; Public Library of Science (PLoS); PLoS Genetics; ISSN 1553-7390, 1553-7404; 2013-06-01; volume 9, issue 6, e1003569; doi 10.1371/journal.pgen.1003569
collection DOAJ
language English
format Article
sources DOAJ
author Matthew J Hangauer
Ian W Vaughn
Michael T McManus
title Pervasive transcription of the human genome produces thousands of previously unidentified long intergenic noncoding RNAs.
publisher Public Library of Science (PLoS)
series PLoS Genetics
issn 1553-7390
1553-7404
publishDate 2013-06-01
description Known protein coding gene exons compose less than 3% of the human genome. The remaining 97% is largely uncharted territory, with only a small fraction characterized. The recent observation of transcription in this intergenic territory has stimulated debate about the extent of intergenic transcription and whether these intergenic RNAs are functional. Here we directly observed with a large set of RNA-seq data covering a wide array of human tissue types that the majority of the genome is indeed transcribed, corroborating recent observations by the ENCODE project. Furthermore, using de novo transcriptome assembly of this RNA-seq data, we found that intergenic regions encode far more long intergenic noncoding RNAs (lincRNAs) than previously described, helping to resolve the discrepancy between the vast amount of observed intergenic transcription and the limited number of previously known lincRNAs. In total, we identified tens of thousands of putative lincRNAs expressed at a minimum of one copy per cell, significantly expanding upon prior lincRNA annotation sets. These lincRNAs are specifically regulated and conserved rather than being the product of transcriptional noise. In addition, lincRNAs are strongly enriched for trait-associated SNPs suggesting a new mechanism by which intergenic trait-associated regions may function. These findings will enable the discovery and interrogation of novel intergenic functional elements.
url http://europepmc.org/articles/PMC3688513?pdf=render