Pan-cancer repository of validated natural and cryptic mRNA splicing mutations [version 3; peer review: 1 approved, 2 approved with reservations]
We present a major public resource of mRNA splicing mutations validated according to multiple lines of evidence of abnormal gene expression. Likely mutations present in all tumor types reported in the Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) were identified ba...
Main Authors: | Ben C. Shirley, Eliseos J. Mucaki, Peter K. Rogan |
---|---|
Format: | Article |
Language: | English |
Published: | F1000 Research Ltd, 2019-09-01 |
Series: | F1000Research |
Online Access: | https://f1000research.com/articles/7-1908/v3 |
id |
doaj-276d14dba3ff461abb47fffea708f7d7 |
---|---|
record_format |
Article |
spelling |
doaj-276d14dba3ff461abb47fffea708f7d7; 2020-11-25T03:20:46Z; eng; F1000 Research Ltd; F1000Research; 2046-1402; 2019-09-01; 7; 10.12688/f1000research.17204.3; 22439; Pan-cancer repository of validated natural and cryptic mRNA splicing mutations [version 3; peer review: 1 approved, 2 approved with reservations]; Ben C. Shirley (0); Eliseos J. Mucaki (1); Peter K. Rogan (2); CytoGnomix Inc., London, Ontario, N5X 3X5, Canada; Biochemistry, University of Western Ontario, London, Ontario, N6A 2C1, Canada; Computer Science, University of Western Ontario, London, Ontario, N6A 2C1, Canada; We present a major public resource of mRNA splicing mutations validated according to multiple lines of evidence of abnormal gene expression. Likely mutations present in all tumor types reported in the Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) were identified based on the comparative strengths of splice sites in tumor versus normal genomes, and then validated by respectively comparing counts of splice junction spanning and abundance of transcript reads in RNA-Seq data from matched tissues and tumors lacking these mutations. The comprehensive resource features 341,486 of these validated mutations, the majority of which (69.9%) are not present in the Single Nucleotide Polymorphism Database (dbSNP 150). There are 131,347 unique mutations which weaken or abolish natural splice sites, and 222,071 mutations which strengthen cryptic splice sites (11,932 affect both simultaneously). 28,812 novel or rare flagged variants (with <1% population frequency in dbSNP) were observed in multiple tumor tissue types. An algorithm was developed to classify variants into splicing molecular phenotypes that integrates germline heterozygosity, degree of information change and impact on expression. The classification thresholds were calibrated against the ClinVar clinical database phenotypic assignments. Variants are partitioned into allele-specific alternative splicing, likely aberrant and aberrant splicing phenotypes. Single variants or chromosome ranges can be queried using a Global Alliance for Genomics and Health (GA4GH)-compliant, web-based Beacon “Validated Splicing Mutations” either separately or in aggregate alongside other Beacons through the public Beacon Network, as well as through our website. The website provides additional information, such as a visual representation of supporting RNAseq results, gene expression in the corresponding normal tissues, and splicing molecular phenotypes.; https://f1000research.com/articles/7-1908/v3 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Ben C. Shirley, Eliseos J. Mucaki, Peter K. Rogan |
title |
Pan-cancer repository of validated natural and cryptic mRNA splicing mutations [version 3; peer review: 1 approved, 2 approved with reservations] |
publisher |
F1000 Research Ltd |
series |
F1000Research |
issn |
2046-1402 |
publishDate |
2019-09-01 |
description |
We present a major public resource of mRNA splicing mutations validated according to multiple lines of evidence of abnormal gene expression. Likely mutations present in all tumor types reported in the Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) were identified based on the comparative strengths of splice sites in tumor versus normal genomes, and then validated by respectively comparing counts of splice junction spanning and abundance of transcript reads in RNA-Seq data from matched tissues and tumors lacking these mutations. The comprehensive resource features 341,486 of these validated mutations, the majority of which (69.9%) are not present in the Single Nucleotide Polymorphism Database (dbSNP 150). There are 131,347 unique mutations which weaken or abolish natural splice sites, and 222,071 mutations which strengthen cryptic splice sites (11,932 affect both simultaneously). 28,812 novel or rare flagged variants (with <1% population frequency in dbSNP) were observed in multiple tumor tissue types. An algorithm was developed to classify variants into splicing molecular phenotypes that integrates germline heterozygosity, degree of information change and impact on expression. The classification thresholds were calibrated against the ClinVar clinical database phenotypic assignments. Variants are partitioned into allele-specific alternative splicing, likely aberrant and aberrant splicing phenotypes. Single variants or chromosome ranges can be queried using a Global Alliance for Genomics and Health (GA4GH)-compliant, web-based Beacon “Validated Splicing Mutations” either separately or in aggregate alongside other Beacons through the public Beacon Network, as well as through our website. The website provides additional information, such as a visual representation of supporting RNAseq results, gene expression in the corresponding normal tissues, and splicing molecular phenotypes. |
url |
https://f1000research.com/articles/7-1908/v3 |
_version_ |
1724616660078297088 |
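The mutation counts quoted in the description field can be cross-checked with a short inclusion-exclusion calculation. The sketch below is only a sanity check on the published figures, not part of the authors' pipeline. It assumes that the 11,932 variants affecting both site types are included in each of the two category counts. The function and variable names are hypothetical and chosen for illustration.

```python
# Figures quoted in the record's description field.
NATURAL_SITE_MUTATIONS = 131_347   # weaken or abolish natural splice sites
CRYPTIC_SITE_MUTATIONS = 222_071   # strengthen cryptic splice sites
AFFECT_BOTH = 11_932               # counted in both categories (assumption)
REPORTED_TOTAL = 341_486           # validated mutations in the resource
FRACTION_NOT_IN_DBSNP = 0.699      # 69.9% absent from dbSNP 150


def union_count(a: int, b: int, overlap: int) -> int:
    """Size of the union of two sets, given their sizes and their overlap."""
    return a + b - overlap


total = union_count(NATURAL_SITE_MUTATIONS, CRYPTIC_SITE_MUTATIONS, AFFECT_BOTH)
print(total)                    # 341486
print(total == REPORTED_TOTAL)  # True

# Approximate number of validated mutations absent from dbSNP 150.
# The percentage is rounded in the source, so this is only an estimate.
approx_not_in_dbsnp = round(REPORTED_TOTAL * FRACTION_NOT_IN_DBSNP)
print(approx_not_in_dbsnp)
```

Under that assumption the two categories sum to the reported total of 341,486 once the overlap is subtracted, so the figures in the description are mutually consistent.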