Loqusdb: added value of an observations database of local genomic variation
Abstract. Background: Exome and genome sequencing is becoming the method of choice for rare disease diagnostics. One of the key challenges remaining is distinguishing the disease-causing variants from the benign background variation. After analysis and annotation of the sequencing data there are typic...
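Main Authors: | Måns Magnusson, Jesper Eisfeldt, Daniel Nilsson, Adam Rosenbaum, Valtteri Wirta, Anna Lindstrand, Anna Wedell, Henrik Stranneheim |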
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2020-07-01 |
Series: | BMC Bioinformatics |
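Subjects: | Genomics, Rare disease, Mendelian, Single nucleotide variant, Structural variant, Population frequency |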
Online Access: | http://link.springer.com/article/10.1186/s12859-020-03609-z |
id |
doaj-4e11b8c3f9e346468e404d1a4a66dfe4 |
---|---|
record_format |
Article |
spelling |
doaj-4e11b8c3f9e346468e404d1a4a66dfe4; 2020-11-25T03:13:11Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2020-07-01; 21; 1; 1; 10; 10.1186/s12859-020-03609-z
Loqusdb: added value of an observations database of local genomic variation
Måns Magnusson (0); Jesper Eisfeldt (1); Daniel Nilsson (2); Adam Rosenbaum (3); Valtteri Wirta (4); Anna Lindstrand (5); Anna Wedell (6); Henrik Stranneheim (7)
Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology, and Health, KTH Royal Institute of Technology; Department of Clinical Genetics, Karolinska University Hospital; Department of Clinical Genetics, Karolinska University Hospital; Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology, and Health, KTH Royal Institute of Technology; Science for Life Laboratory, Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet; Department of Clinical Genetics, Karolinska University Hospital; Department of Molecular Medicine and Surgery, Karolinska Institutet; Department of Molecular Medicine and Surgery, Karolinska Institutet
Abstract. Background: Exome and genome sequencing is becoming the method of choice for rare disease diagnostics. One of the key challenges remaining is distinguishing the disease-causing variants from the benign background variation. After analysis and annotation of the sequencing data there are typically thousands of candidate variants requiring further investigation. One of the most effective and least biased ways to reduce this number is to assess the rarity of a variant in any population. Currently, there are a number of reliable sources of information for major population frequencies when considering single nucleotide variants (SNVs) and small insertions and deletions (INDELs), with gnomAD as the most prominent public resource available. However, local variation or frequencies in sub-populations may be underrepresented in these public resources. In contrast, for structural variation (SV), the background frequency in the general population is more or less unknown, mostly due to challenges in calling SVs in a consistent way. Keeping track of local variation is one way to overcome these problems and significantly reduce the number of potential disease-causing variants retained for manual inspection, both for SNVs and SVs.
Results: Here, we present loqusdb, a tool to solve the challenge of keeping track of any type of variant observations from genome sequencing data. Loqusdb was designed to handle a large flow of samples and, unlike other solutions, samples can be added continuously to the database without rebuilding it, facilitating improvements and additions. We assessed the added value of a local observations database using 98 samples annotated with information from a background of 888 unrelated individuals.
Conclusions: We show both how powerful SV analysis can be when filtering for population frequencies and how the number of apparently rare SNVs/INDELs can be reduced by adding local population information even after annotating the data with other large frequency databases, such as gnomAD. In conclusion, we show that a local frequency database is an attractive and necessary addition to the publicly available databases that facilitate the analysis of exome and genome data in a clinical setting.
http://link.springer.com/article/10.1186/s12859-020-03609-z
Genomics; Rare disease; Mendelian; Single nucleotide variant; Structural variant; Population frequency |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Måns Magnusson Jesper Eisfeldt Daniel Nilsson Adam Rosenbaum Valtteri Wirta Anna Lindstrand Anna Wedell Henrik Stranneheim |
spellingShingle |
Måns Magnusson Jesper Eisfeldt Daniel Nilsson Adam Rosenbaum Valtteri Wirta Anna Lindstrand Anna Wedell Henrik Stranneheim Loqusdb: added value of an observations database of local genomic variation BMC Bioinformatics Genomics Rare disease Mendelian Single nucleotide variant Structural variant Population frequency |
author_facet |
Måns Magnusson Jesper Eisfeldt Daniel Nilsson Adam Rosenbaum Valtteri Wirta Anna Lindstrand Anna Wedell Henrik Stranneheim |
author_sort |
Måns Magnusson |
title |
Loqusdb: added value of an observations database of local genomic variation |
title_short |
Loqusdb: added value of an observations database of local genomic variation |
title_full |
Loqusdb: added value of an observations database of local genomic variation |
title_fullStr |
Loqusdb: added value of an observations database of local genomic variation |
title_full_unstemmed |
Loqusdb: added value of an observations database of local genomic variation |
title_sort |
loqusdb: added value of an observations database of local genomic variation |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2020-07-01 |
description |
Abstract. Background: Exome and genome sequencing is becoming the method of choice for rare disease diagnostics. One of the key challenges remaining is distinguishing the disease-causing variants from the benign background variation. After analysis and annotation of the sequencing data there are typically thousands of candidate variants requiring further investigation. One of the most effective and least biased ways to reduce this number is to assess the rarity of a variant in any population. Currently, there are a number of reliable sources of information for major population frequencies when considering single nucleotide variants (SNVs) and small insertions and deletions (INDELs), with gnomAD as the most prominent public resource available. However, local variation or frequencies in sub-populations may be underrepresented in these public resources. In contrast, for structural variation (SV), the background frequency in the general population is more or less unknown, mostly due to challenges in calling SVs in a consistent way. Keeping track of local variation is one way to overcome these problems and significantly reduce the number of potential disease-causing variants retained for manual inspection, both for SNVs and SVs.
Results: Here, we present loqusdb, a tool to solve the challenge of keeping track of any type of variant observations from genome sequencing data. Loqusdb was designed to handle a large flow of samples and, unlike other solutions, samples can be added continuously to the database without rebuilding it, facilitating improvements and additions. We assessed the added value of a local observations database using 98 samples annotated with information from a background of 888 unrelated individuals.
Conclusions: We show both how powerful SV analysis can be when filtering for population frequencies and how the number of apparently rare SNVs/INDELs can be reduced by adding local population information even after annotating the data with other large frequency databases, such as gnomAD. In conclusion, we show that a local frequency database is an attractive and necessary addition to the publicly available databases that facilitate the analysis of exome and genome data in a clinical setting. |
topic |
Genomics Rare disease Mendelian Single nucleotide variant Structural variant Population frequency |
url |
http://link.springer.com/article/10.1186/s12859-020-03609-z |
work_keys_str_mv |
AT mansmagnusson loqusdbaddedvalueofanobservationsdatabaseoflocalgenomicvariation AT jespereisfeldt loqusdbaddedvalueofanobservationsdatabaseoflocalgenomicvariation AT danielnilsson loqusdbaddedvalueofanobservationsdatabaseoflocalgenomicvariation AT adamrosenbaum loqusdbaddedvalueofanobservationsdatabaseoflocalgenomicvariation AT valtteriwirta loqusdbaddedvalueofanobservationsdatabaseoflocalgenomicvariation AT annalindstrand loqusdbaddedvalueofanobservationsdatabaseoflocalgenomicvariation AT annawedell loqusdbaddedvalueofanobservationsdatabaseoflocalgenomicvariation AT henrikstranneheim loqusdbaddedvalueofanobservationsdatabaseoflocalgenomicvariation |
_version_ |
1724648239479652352 |