Pan-cancer repository of validated natural and cryptic mRNA splicing mutations [version 3; peer review: 1 approved, 2 approved with reservations]

We present a major public resource of mRNA splicing mutations validated according to multiple lines of evidence of abnormal gene expression. Likely mutations present in all tumor types reported in the Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) were identified ba...


Bibliographic Details
Main Authors: Ben C. Shirley, Eliseos J. Mucaki, Peter K. Rogan
Format: Article
Language: English
Published: F1000 Research Ltd 2019-09-01
Series: F1000Research
Online Access: https://f1000research.com/articles/7-1908/v3
id doaj-276d14dba3ff461abb47fffea708f7d7
record_format Article
spelling doaj-276d14dba3ff461abb47fffea708f7d7 | 2020-11-25T03:20:46Z | eng | F1000 Research Ltd | F1000Research | 2046-1402 | 2019-09-01 | 7 | 10.12688/f1000research.17204.3 | 22439 | Pan-cancer repository of validated natural and cryptic mRNA splicing mutations [version 3; peer review: 1 approved, 2 approved with reservations] | Ben C. Shirley 0 | Eliseos J. Mucaki 1 | Peter K. Rogan 2 | CytoGnomix Inc., London, Ontario, N5X 3X5, Canada | Biochemistry, University of Western Ontario, London, Ontario, N6A 2C1, Canada | Computer Science, University of Western Ontario, London, Ontario, N6A 2C1, Canada | https://f1000research.com/articles/7-1908/v3
collection DOAJ
language English
format Article
sources DOAJ
author Ben C. Shirley
Eliseos J. Mucaki
Peter K. Rogan
title Pan-cancer repository of validated natural and cryptic mRNA splicing mutations [version 3; peer review: 1 approved, 2 approved with reservations]
publisher F1000 Research Ltd
series F1000Research
issn 2046-1402
publishDate 2019-09-01
description We present a major public resource of mRNA splicing mutations validated according to multiple lines of evidence of abnormal gene expression. Likely mutations present in all tumor types reported in The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) were identified based on the comparative strengths of splice sites in tumor versus normal genomes, and then validated by comparing, respectively, counts of splice junction-spanning reads and abundance of transcript reads in RNA-Seq data from matched tissues and from tumors lacking these mutations. The resource features 341,486 of these validated mutations, the majority of which (69.9%) are not present in the Single Nucleotide Polymorphism Database (dbSNP 150). There are 131,347 unique mutations that weaken or abolish natural splice sites, and 222,071 mutations that strengthen cryptic splice sites (11,932 affect both simultaneously). A total of 28,812 novel or rare flagged variants (with <1% population frequency in dbSNP) were observed in multiple tumor tissue types. An algorithm was developed to classify variants into splicing molecular phenotypes; it integrates germline heterozygosity, degree of information change and impact on expression. The classification thresholds were calibrated against the phenotypic assignments in the ClinVar clinical database. Variants are partitioned into allele-specific alternative splicing, likely aberrant and aberrant splicing phenotypes. Single variants or chromosome ranges can be queried using a Global Alliance for Genomics and Health (GA4GH)-compliant, web-based Beacon, “Validated Splicing Mutations”, either separately or in aggregate alongside other Beacons through the public Beacon Network, as well as through our website. The website provides additional information, such as a visual representation of supporting RNA-Seq results, gene expression in the corresponding normal tissues, and splicing molecular phenotypes.
url https://f1000research.com/articles/7-1908/v3
_version_ 1724616660078297088
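
The mutation counts quoted in the description field can be cross-checked with simple inclusion-exclusion arithmetic. The sketch below assumes that the 11,932 variants affecting both site types are counted once in each of the two category totals; the record does not state this explicitly. The variable and function names are illustrative only and do not come from the article.

```python
# Counts as quoted in the description field of this record.
natural_site_mutations = 131_347  # weaken or abolish natural splice sites
cryptic_site_mutations = 222_071  # strengthen cryptic splice sites
both_effects = 11_932             # assumed to be included in both counts above
reported_total = 341_486          # validated mutations reported in the resource


def union_size(a: int, b: int, overlap: int) -> int:
    """Inclusion-exclusion for two sets: |A or B| = |A| + |B| - |A and B|."""
    return a + b - overlap


unique_total = union_size(natural_site_mutations, cryptic_site_mutations, both_effects)
print(unique_total, unique_total == reported_total)
```

Under that assumption, the two category counts minus the overlap reproduce the reported total of 341,486.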