Loqusdb: added value of an observations database of local genomic variation

Abstract Background Exome and genome sequencing are becoming the method of choice for rare disease diagnostics. One of the key remaining challenges is distinguishing disease-causing variants from benign background variation. After analysis and annotation of the sequencing data there are typic...

Bibliographic Details
Main Authors: Måns Magnusson, Jesper Eisfeldt, Daniel Nilsson, Adam Rosenbaum, Valtteri Wirta, Anna Lindstrand, Anna Wedell, Henrik Stranneheim
Format: Article
Language: English
Published: BMC 2020-07-01
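Series: BMC Bioinformatics
Subjects: Genomics; Rare disease; Mendelian; Single nucleotide variant; Structural variant; Population frequency
Online Access: http://link.springer.com/article/10.1186/s12859-020-03609-z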
id doaj-4e11b8c3f9e346468e404d1a4a66dfe4
record_format Article
spelling doaj-4e11b8c3f9e346468e404d1a4a66dfe4; 2020-11-25T03:13:11Z; eng; BMC; BMC Bioinformatics; ISSN 1471-2105; 2020-07-01; vol. 21, iss. 1, pp. 1-10; doi 10.1186/s12859-020-03609-z
affiliations Måns Magnusson: Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology, and Health, KTH Royal Institute of Technology
Jesper Eisfeldt: Department of Clinical Genetics, Karolinska University Hospital
Daniel Nilsson: Department of Clinical Genetics, Karolinska University Hospital
Adam Rosenbaum: Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology, and Health, KTH Royal Institute of Technology
Valtteri Wirta: Science for Life Laboratory, Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet
Anna Lindstrand: Department of Clinical Genetics, Karolinska University Hospital
Anna Wedell: Department of Molecular Medicine and Surgery, Karolinska Institutet
Henrik Stranneheim: Department of Molecular Medicine and Surgery, Karolinska Institutet
collection DOAJ
language English
format Article
sources DOAJ
author Måns Magnusson
Jesper Eisfeldt
Daniel Nilsson
Adam Rosenbaum
Valtteri Wirta
Anna Lindstrand
Anna Wedell
Henrik Stranneheim
title Loqusdb: added value of an observations database of local genomic variation
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2020-07-01
description Abstract
Background: Exome and genome sequencing are becoming the method of choice for rare disease diagnostics. One of the key remaining challenges is distinguishing disease-causing variants from benign background variation. After analysis and annotation of the sequencing data there are typically thousands of candidate variants requiring further investigation. One of the most effective and least biased ways to reduce this number is to assess the rarity of a variant in any population. Currently, there are a number of reliable sources of major population frequencies for single nucleotide variants (SNVs) and small insertions and deletions (INDELs), with gnomAD as the most prominent public resource. However, local variation or frequencies in sub-populations may be underrepresented in these public resources. In contrast, for structural variation (SV), the background frequency in the general population is largely unknown, mostly because SVs are difficult to call in a consistent way. Keeping track of local variation is one way to overcome these problems and significantly reduce the number of potential disease-causing variants retained for manual inspection, both for SNVs and SVs.
Results: Here, we present loqusdb, a tool to solve the challenge of keeping track of any type of variant observations from genome sequencing data. Loqusdb was designed to handle a large flow of samples and, unlike other solutions, allows samples to be added continuously to the database without rebuilding it, facilitating improvements and additions. We assessed the added value of a local observations database using 98 samples annotated with information from a background of 888 unrelated individuals.
Conclusions: We show both how powerful SV analysis can be when filtering for population frequencies and how the number of apparently rare SNVs/INDELs can be reduced by adding local population information, even after annotating the data with other large frequency databases such as gnomAD. In conclusion, we show that a local frequency database is an attractive and necessary addition to the publicly available databases that facilitate the analysis of exome and genome data in a clinical setting.
topic Genomics
Rare disease
Mendelian
Single nucleotide variant
Structural variant
Population frequency
url http://link.springer.com/article/10.1186/s12859-020-03609-z