When the levee breaks: a practical guide to sketching algorithms for processing the flood of genomic data

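Abstract: Considerable advances in genomics over the past decade have resulted in vast amounts of data being generated and deposited in global archives. The growth of these archives exceeds our ability to process their content, leading to significant analysis bottlenecks. Sketching algorithms produce small, approximate summaries of data and have shown great utility in tackling this flood of genomic data, while using minimal compute resources. This article reviews the current state of the field, focusing on how the algorithms work and how genomicists can utilize them effectively. References to interactive workbooks for explaining concepts and demonstrating workflows are included at https://github.com/will-rowe/genome-sketching.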

Bibliographic Details
Main Author: Will P. M. Rowe
Format: Article
Language: English
Published: BMC 2019-09-01
Series: Genome Biology
Online Access: http://link.springer.com/article/10.1186/s13059-019-1809-x
id doaj-a91f2ba74bbb4093a0eb4d40e57746c4
record_format Article
spelling doaj-a91f2ba74bbb4093a0eb4d40e57746c4 | 2020-11-25T03:40:07Z | eng | BMC | Genome Biology | 1474-760X | 2019-09-01 | 20 | 1 | 1 | 12 | 10.1186/s13059-019-1809-x | When the levee breaks: a practical guide to sketching algorithms for processing the flood of genomic data | Will P. M. Rowe (Institute of Microbiology and Infection, School of Biosciences, University of Birmingham)
collection DOAJ
language English
format Article
sources DOAJ
author Will P. M. Rowe
title When the levee breaks: a practical guide to sketching algorithms for processing the flood of genomic data
publisher BMC
series Genome Biology
issn 1474-760X
publishDate 2019-09-01
description Abstract Considerable advances in genomics over the past decade have resulted in vast amounts of data being generated and deposited in global archives. The growth of these archives exceeds our ability to process their content, leading to significant analysis bottlenecks. Sketching algorithms produce small, approximate summaries of data and have shown great utility in tackling this flood of genomic data, while using minimal compute resources. This article reviews the current state of the field, focusing on how the algorithms work and how genomicists can utilize them effectively. References to interactive workbooks for explaining concepts and demonstrating workflows are included at https://github.com/will-rowe/genome-sketching.
url http://link.springer.com/article/10.1186/s13059-019-1809-x