The design and construction of reference pangenome graphs with minigraph

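Abstract: The recent advances in sequencing technologies enable the assembly of individual genomes to the quality of the reference genome. How to integrate multiple genomes from the same species and make the integrated representation accessible to biologists remains an open challenge. Here, we propose a graph-based data model and associated formats to represent multiple genomes while preserving the coordinate of the linear reference genome. We implement our ideas in the minigraph toolkit and demonstrate that we can efficiently construct a pangenome graph and compactly encode tens of thousands of structural variants missing from the current reference genome.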

Bibliographic Details
Main Authors: Heng Li, Xiaowen Feng, Chong Chu
Format: Article
Language: English
Published: BMC 2020-10-01
Series: Genome Biology
Subjects: Bioinformatics; Genomics; Pangenome
Online Access: http://link.springer.com/article/10.1186/s13059-020-02168-z
id doaj-70e3bfefab6a479999dc7cc28f1a7640
record_format Article
spelling doaj-70e3bfefab6a479999dc7cc28f1a7640 | 2020-11-25T04:08:11Z | eng | BMC | Genome Biology | 1474-760X | 2020-10-01 | 21 | 1 | 1 | 19 | 10.1186/s13059-020-02168-z | The design and construction of reference pangenome graphs with minigraph | Heng Li (0), Xiaowen Feng (1), Chong Chu (2) | (0) Department of Data Sciences, Dana-Farber Cancer Institute; (1) Department of Data Sciences, Dana-Farber Cancer Institute; (2) Department of Biomedical Informatics, Harvard Medical School | Abstract The recent advances in sequencing technologies enable the assembly of individual genomes to the quality of the reference genome. How to integrate multiple genomes from the same species and make the integrated representation accessible to biologists remains an open challenge. Here, we propose a graph-based data model and associated formats to represent multiple genomes while preserving the coordinate of the linear reference genome. We implement our ideas in the minigraph toolkit and demonstrate that we can efficiently construct a pangenome graph and compactly encode tens of thousands of structural variants missing from the current reference genome. | http://link.springer.com/article/10.1186/s13059-020-02168-z | Bioinformatics; Genomics; Pangenome
collection DOAJ
language English
format Article
sources DOAJ
author Heng Li
Xiaowen Feng
Chong Chu
publisher BMC
series Genome Biology
issn 1474-760X
publishDate 2020-10-01
description Abstract The recent advances in sequencing technologies enable the assembly of individual genomes to the quality of the reference genome. How to integrate multiple genomes from the same species and make the integrated representation accessible to biologists remains an open challenge. Here, we propose a graph-based data model and associated formats to represent multiple genomes while preserving the coordinate of the linear reference genome. We implement our ideas in the minigraph toolkit and demonstrate that we can efficiently construct a pangenome graph and compactly encode tens of thousands of structural variants missing from the current reference genome.
topic Bioinformatics
Genomics
Pangenome
url http://link.springer.com/article/10.1186/s13059-020-02168-z