A Deep Learning Approach for Detecting Copy Number Variation in Next-Generation Sequencing Data


Bibliographic Details
Main Authors:	Tom Hill, Robert L. Unckless
Format:	Article
Language:	English
Published:	Oxford University Press 2019-11-01
Series:	G3: Genes, Genomes, Genetics
Subjects:	coverage; deletion; duplication; machine-learning; next-generation sequencing
Online Access:	http://g3journal.org/lookup/doi/10.1534/g3.119.400596

Description
Summary:	Copy number variants (CNV) are associated with phenotypic variation in several species. However, properly detecting changes in copy numbers of sequences remains a difficult problem, especially in lower quality or lower coverage next-generation sequencing data. Here, inspired by recent applications of machine learning in genomics, we describe a method to detect duplications and deletions in short-read sequencing data. In low coverage data, machine learning appears to be more powerful in the detection of CNVs than the gold-standard methods of coverage estimation alone, and of equal power in high coverage data. We also demonstrate how replicating training sets allows a more precise detection of CNVs, even identifying novel CNVs in two genomes previously surveyed thoroughly for CNVs using long read data.
ISSN:	2160-1836

Similar Items