Mutation discovery in regions of segmental cancer genome amplifications from next generation sequencing of tumours

Next generation sequencing has now enabled a cost-effective enumeration of the full mutational complement of a tumour genome - in particular single nucleotide variants (SNVs). However, most current computational and statistical models for analyzing next generation sequencing data do not account fo...

Bibliographic Details
Main Author: Anamaria, Crisan
Language: English
Published: University of British Columbia 2010
Online Access: http://hdl.handle.net/2429/29454
id ndltd-UBC-oai-circle.library.ubc.ca-2429-29454
record_format oai_dc
spelling ndltd-UBC-oai-circle.library.ubc.ca-2429-29454 2018-01-05T17:24:39Z Science, Faculty of Graduate 2010-10-22T15:15:53Z 2010-10-22T15:15:53Z 2010 2010-11 Text Thesis/Dissertation http://hdl.handle.net/2429/29454 eng Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/ University of British Columbia
collection NDLTD
language English
sources NDLTD
description Next generation sequencing has now enabled a cost-effective enumeration of the full mutational complement of a tumour genome, in particular its single nucleotide variants (SNVs). However, most current computational and statistical models for analyzing next generation sequencing data do not account for cancer-specific biological properties, including somatic segmental copy number alterations (CNAs), which require special treatment of the data. Here we present CoNAn-SNV (Copy Number Annotated SNV): a novel algorithm for the inference of SNVs that overlap copy number alterations. The method models the notion that genomic regions of segmental duplication and amplification induce an extended ‘genotype space’ in which a subset of genotypes exhibits heavily skewed allelic distributions in SNVs (rendering them undetectable by methods that assume diploidy). CoNAn-SNV introduces the concept of modelling allelic counts from sequencing data using a panel of Binomial mixture models, where the number of mixture components for a given locus in the genome is informed by a discrete copy number state given as input. We applied CoNAn-SNV to a previously published whole genome shotgun data set obtained from a lobular breast cancer and show that it detects 24 experimentally revalidated somatic non-synonymous mutations that were not detected by copy-number-insensitive SNV discovery algorithms. Importantly, ROC analysis shows that the increased sensitivity of CoNAn-SNV does not result in a disproportionate loss of specificity. Our results indicate that in genomically unstable tumours, copy number annotation for SNV detection will be critical to fully characterize the mutational landscape of cancer genomes.
The Binomial mixture model framework, however, is unable to fully cope with the complexity of tumour sequence data. We explore substituting the Binomial mixture model framework with the Beta-Binomial framework to overcome this limitation. The effectiveness of this substitution is assessed on the lobular breast carcinoma data set and on the 30 exon capture data sets derived from triple negative breast cancers. The performance of the Binomial and Beta-Binomial mixture models is evaluated against a cohort of exon capture test cases, and we report ROC and F-measures. === Science, Faculty of === Graduate
author Anamaria, Crisan
title Mutation discovery in regions of segmental cancer genome amplifications from next generation sequencing of tumours
publisher University of British Columbia
publishDate 2010
url http://hdl.handle.net/2429/29454
work_keys_str_mv AT anamariacrisan mutationdiscoveryinregionsofsegmentalcancergenomeamplificationsfromnextgenerationsequencingoftumours
_version_ 1718582670160560128
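
The abstract's central idea, that the copy number at a locus determines how many genotype states a Binomial mixture should consider, can be illustrated with a short sketch. This is not the thesis's implementation: the function names, the k / copy_number allele fractions, the fixed error rate and the uniform prior over genotypes are all assumptions made here for illustration only.

```python
from math import comb


def genotype_allele_fractions(copy_number, error_rate=0.01):
    """Expected variant-allele fraction for each genotype at a locus.

    Assumption (illustrative): a locus present in `copy_number` copies
    (copy_number >= 1) has copy_number + 1 genotypes, one for each count
    k of variant copies, with expected fraction k / copy_number. The
    fractions are clipped away from 0 and 1 by a hypothetical
    sequencing error rate.
    """
    fractions = []
    for k in range(copy_number + 1):
        f = k / copy_number
        fractions.append(min(max(f, error_rate), 1.0 - error_rate))
    return fractions


def binomial_pmf(k, n, p):
    """Probability of k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def genotype_posteriors(variant_reads, depth, copy_number, error_rate=0.01):
    """Posterior over genotypes, assuming a uniform prior (illustrative)."""
    fractions = genotype_allele_fractions(copy_number, error_rate)
    likelihoods = [binomial_pmf(variant_reads, depth, f) for f in fractions]
    total = sum(likelihoods)
    return [lik / total for lik in likelihoods]


def variant_probability(variant_reads, depth, copy_number, error_rate=0.01):
    """Posterior probability that at least one copy carries the variant."""
    posteriors = genotype_posteriors(variant_reads, depth, copy_number, error_rate)
    return 1.0 - posteriors[0]


# A skewed locus: 6 variant reads out of 40.
diploid = variant_probability(6, 40, copy_number=2)
amplified = variant_probability(6, 40, copy_number=4)
print(f"diploid model:   {diploid:.3f}")
print(f"four-copy model: {amplified:.3f}")
```

With these assumed settings, the four-copy model assigns a much higher variant probability to the skewed locus than the diploid model does, because it includes a genotype whose expected allele fraction (one variant copy in four) lies close to the observed 6/40. This mirrors the abstract's point that skewed allelic distributions can go undetected by methods that assume diploidy.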