A Deep Learning Approach for Detecting Copy Number Variation in Next-Generation Sequencing Data
Copy number variants (CNV) are associated with phenotypic variation in several species. However, properly detecting changes in copy numbers of sequences remains a difficult problem, especially in lower quality or lower coverage next-generation sequencing data. Here, inspired by recent applications o...
Main Authors: | Tom Hill, Robert L. Unckless |
---|---|
Format: | Article |
Language: | English |
Published: | Oxford University Press, 2019-11-01 |
Series: | G3: Genes, Genomes, Genetics |
Subjects: | coverage, deletion, duplication, machine-learning, next-generation sequencing |
Online Access: | http://g3journal.org/lookup/doi/10.1534/g3.119.400596 |
id | doaj-ba44e60bc6b24ec187e15a04c00cefc2 |
---|---|
record_format | Article |
spelling | doaj-ba44e60bc6b24ec187e15a04c00cefc2; 2021-07-02T12:26:05Z; eng; Oxford University Press; G3: Genes, Genomes, Genetics; 2160-1836; 2019-11-01; 9; 11; 3575; 3582; 10.1534/g3.119.400596; 8; A Deep Learning Approach for Detecting Copy Number Variation in Next-Generation Sequencing Data; Tom Hill; Robert L. Unckless; Copy number variants (CNV) are associated with phenotypic variation in several species. However, properly detecting changes in copy numbers of sequences remains a difficult problem, especially in lower quality or lower coverage next-generation sequencing data. Here, inspired by recent applications of machine learning in genomics, we describe a method to detect duplications and deletions in short-read sequencing data. In low coverage data, machine learning appears to be more powerful in the detection of CNVs than the gold-standard methods of coverage estimation alone, and of equal power in high coverage data. We also demonstrate how replicating training sets allows a more precise detection of CNVs, even identifying novel CNVs in two genomes previously surveyed thoroughly for CNVs using long read data.; http://g3journal.org/lookup/doi/10.1534/g3.119.400596; coverage; deletion; duplication; machine-learning; next-generation sequencing |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | Tom Hill, Robert L. Unckless |
author_facet | Tom Hill, Robert L. Unckless |
author_sort | Tom Hill |
title | A Deep Learning Approach for Detecting Copy Number Variation in Next-Generation Sequencing Data |
publisher | Oxford University Press |
series | G3: Genes, Genomes, Genetics |
issn | 2160-1836 |
publishDate | 2019-11-01 |
description | Copy number variants (CNV) are associated with phenotypic variation in several species. However, properly detecting changes in copy numbers of sequences remains a difficult problem, especially in lower quality or lower coverage next-generation sequencing data. Here, inspired by recent applications of machine learning in genomics, we describe a method to detect duplications and deletions in short-read sequencing data. In low coverage data, machine learning appears to be more powerful in the detection of CNVs than the gold-standard methods of coverage estimation alone, and of equal power in high coverage data. We also demonstrate how replicating training sets allows a more precise detection of CNVs, even identifying novel CNVs in two genomes previously surveyed thoroughly for CNVs using long read data. |
topic | coverage, deletion, duplication, machine-learning, next-generation sequencing |
url | http://g3journal.org/lookup/doi/10.1534/g3.119.400596 |