Haplotype-aware diplotyping from noisy long reads

Abstract: Current genotyping approaches for single-nucleotide variations rely on short, accurate reads from second-generation sequencing devices. Presently, third-generation sequencing platforms are rapidly becoming more widespread, yet approaches for leveraging their long but error-prone reads for g...

Bibliographic Details
Main Authors: Jana Ebler, Marina Haukness, Trevor Pesout, Tobias Marschall, Benedict Paten
Format: Article
Language: English
Published: BMC 2019-06-01
Series: Genome Biology
Subjects: Computational genomics; Long reads; Genotyping; Phasing; Haplotypes; Diplotypes
Online Access: http://link.springer.com/article/10.1186/s13059-019-1709-0
id doaj-c9258fdd94314d80b31bd0c99666e958
record_format Article
spelling doaj-c9258fdd94314d80b31bd0c99666e958
2020-11-25T03:31:23Z
eng
BMC
Genome Biology
1474-760X
2019-06-01
201116
10.1186/s13059-019-1709-0
Haplotype-aware diplotyping from noisy long reads
Jana Ebler (0)
Marina Haukness (1)
Trevor Pesout (2)
Tobias Marschall (3)
Benedict Paten (4)
Center for Bioinformatics, Saarland University
UC Santa Cruz Genomics Institute, University of California Santa Cruz
UC Santa Cruz Genomics Institute, University of California Santa Cruz
Center for Bioinformatics, Saarland University
UC Santa Cruz Genomics Institute, University of California Santa Cruz
Abstract: Current genotyping approaches for single-nucleotide variations rely on short, accurate reads from second-generation sequencing devices. Presently, third-generation sequencing platforms are rapidly becoming more widespread, yet approaches for leveraging their long but error-prone reads for genotyping are lacking. Here, we introduce a novel statistical framework for the joint inference of haplotypes and genotypes from noisy long reads, which we term diplotyping. Our technique takes full advantage of linkage information provided by long reads. We validate hundreds of thousands of candidate variants that have not yet been included in the high-confidence reference set of the Genome-in-a-Bottle effort.
http://link.springer.com/article/10.1186/s13059-019-1709-0
Computational genomics
Long reads
Genotyping
Phasing
Haplotypes
Diplotypes
collection DOAJ
language English
format Article
sources DOAJ
author Jana Ebler
Marina Haukness
Trevor Pesout
Tobias Marschall
Benedict Paten
title Haplotype-aware diplotyping from noisy long reads
publisher BMC
series Genome Biology
issn 1474-760X
publishDate 2019-06-01
description Abstract: Current genotyping approaches for single-nucleotide variations rely on short, accurate reads from second-generation sequencing devices. Presently, third-generation sequencing platforms are rapidly becoming more widespread, yet approaches for leveraging their long but error-prone reads for genotyping are lacking. Here, we introduce a novel statistical framework for the joint inference of haplotypes and genotypes from noisy long reads, which we term diplotyping. Our technique takes full advantage of linkage information provided by long reads. We validate hundreds of thousands of candidate variants that have not yet been included in the high-confidence reference set of the Genome-in-a-Bottle effort.
topic Computational genomics
Long reads
Genotyping
Phasing
Haplotypes
Diplotypes
url http://link.springer.com/article/10.1186/s13059-019-1709-0