Hapl-o-Mat: open-source software for HLA haplotype frequency estimation from ambiguous and heterogeneous data

Abstract. Background: Knowledge of HLA haplotypes is helpful in many settings, such as disease association studies, population genetics, or hematopoietic stem cell transplantation. Regarding the recruitment of unrelated hematopoietic stem cell donors, HLA haplotype frequencies of specific populations are us...

Bibliographic Details
Main Authors: Christian Schäfer, Alexander H. Schmidt, Jürgen Sauter
Format: Article
Language: English
Published: BMC 2017-05-01
Series: BMC Bioinformatics
Subjects:
HLA
Online Access: http://link.springer.com/article/10.1186/s12859-017-1692-y
id doaj-aa95d314915a46f9b0ba1ab844d2f1a5
record_format Article
spelling doaj-aa95d314915a46f9b0ba1ab844d2f1a5 | 2020-11-25T00:53:41Z | eng | BMC | BMC Bioinformatics | 1471-2105 | 2017-05-01 | 18 | 1 | 1 | 10 | 10.1186/s12859-017-1692-y
Hapl-o-Mat: open-source software for HLA haplotype frequency estimation from ambiguous and heterogeneous data
Christian Schäfer [0], Alexander H. Schmidt [1], Jürgen Sauter [2]
[0] DKMS gemeinnützige GmbH, [1] DKMS gemeinnützige GmbH, [2] DKMS gemeinnützige GmbH
Abstract. Background: Knowledge of HLA haplotypes is helpful in many settings, such as disease association studies, population genetics, or hematopoietic stem cell transplantation. Regarding the recruitment of unrelated hematopoietic stem cell donors, HLA haplotype frequencies of specific populations are used to optimize both donor searches for individual patients and strategic donor registry planning. However, the estimation of haplotype frequencies from HLA genotyping data is challenged by the large amount of genotype data, the complex HLA nomenclature, and the heterogeneous and ambiguous nature of typing records. Results: To meet these challenges, we have developed the open-source software Hapl-o-Mat. It estimates haplotype frequencies from population data including an arbitrary number of loci using an expectation-maximization algorithm. Its key features are the processing of different HLA typing resolutions within a given population sample and the handling of ambiguities recorded via multiple allele codes or genotype list strings. Implemented in C++, Hapl-o-Mat facilitates efficient haplotype frequency estimation from large amounts of genotype data. We demonstrate its accuracy and performance on the basis of artificial and real genotype data. Conclusions: Hapl-o-Mat is versatile and efficient software for HLA haplotype frequency estimation. Its capability of processing various forms of HLA genotype data allows for straightforward haplotype frequency estimation from typing records usually found in stem cell donor registries.
http://link.springer.com/article/10.1186/s12859-017-1692-y
HLA; Immunogenetics; Population genetics; Bioinformatics; Haplotype; Expectation-maximization algorithm
collection DOAJ
language English
format Article
sources DOAJ
author Christian Schäfer
Alexander H. Schmidt
Jürgen Sauter
spellingShingle Christian Schäfer
Alexander H. Schmidt
Jürgen Sauter
Hapl-o-Mat: open-source software for HLA haplotype frequency estimation from ambiguous and heterogeneous data
BMC Bioinformatics
HLA
Immunogenetics
Population genetics
Bioinformatics
Haplotype
Expectation-maximization algorithm
author_facet Christian Schäfer
Alexander H. Schmidt
Jürgen Sauter
author_sort Christian Schäfer
title Hapl-o-Mat: open-source software for HLA haplotype frequency estimation from ambiguous and heterogeneous data
title_sort hapl-o-mat: open-source software for hla haplotype frequency estimation from ambiguous and heterogeneous data
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2017-05-01
description Abstract. Background: Knowledge of HLA haplotypes is helpful in many settings, such as disease association studies, population genetics, or hematopoietic stem cell transplantation. Regarding the recruitment of unrelated hematopoietic stem cell donors, HLA haplotype frequencies of specific populations are used to optimize both donor searches for individual patients and strategic donor registry planning. However, the estimation of haplotype frequencies from HLA genotyping data is challenged by the large amount of genotype data, the complex HLA nomenclature, and the heterogeneous and ambiguous nature of typing records. Results: To meet these challenges, we have developed the open-source software Hapl-o-Mat. It estimates haplotype frequencies from population data including an arbitrary number of loci using an expectation-maximization algorithm. Its key features are the processing of different HLA typing resolutions within a given population sample and the handling of ambiguities recorded via multiple allele codes or genotype list strings. Implemented in C++, Hapl-o-Mat facilitates efficient haplotype frequency estimation from large amounts of genotype data. We demonstrate its accuracy and performance on the basis of artificial and real genotype data. Conclusions: Hapl-o-Mat is versatile and efficient software for HLA haplotype frequency estimation. Its capability of processing various forms of HLA genotype data allows for straightforward haplotype frequency estimation from typing records usually found in stem cell donor registries.
topic HLA
Immunogenetics
Population genetics
Bioinformatics
Haplotype
Expectation-maximization algorithm
url http://link.springer.com/article/10.1186/s12859-017-1692-y
work_keys_str_mv AT christianschafer haplomatopensourcesoftwareforhlahaplotypefrequencyestimationfromambiguousandheterogeneousdata
AT alexanderhschmidt haplomatopensourcesoftwareforhlahaplotypefrequencyestimationfromambiguousandheterogeneousdata
AT jurgensauter haplomatopensourcesoftwareforhlahaplotypefrequencyestimationfromambiguousandheterogeneousdata
_version_ 1725237038074036224
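
The description above mentions haplotype frequency estimation with an expectation-maximization algorithm. As a rough illustration of that general technique only, here is a minimal textbook-style sketch in Python. It is not Hapl-o-Mat's C++ implementation and it ignores typing ambiguities, multiple allele codes, and genotype list strings. The function names `diplotypes` and `em_haplotype_frequencies`, the input layout, and the allele labels are all assumptions made for this example, and Hardy-Weinberg equilibrium is assumed in the E-step.

```python
from collections import defaultdict
from itertools import product


def diplotypes(genotype):
    """Enumerate unordered haplotype pairs compatible with an unphased genotype.

    genotype: tuple of loci, each locus a pair of allele names
    (hypothetical layout chosen for this sketch).
    """
    pairs = set()
    # Fix the orientation of the first locus and try both orientations of
    # every other locus; the set removes duplicates from homozygous loci.
    first, rest = genotype[0], genotype[1:]
    for flips in product((False, True), repeat=len(rest)):
        hap1, hap2 = [first[0]], [first[1]]
        for (a, b), flip in zip(rest, flips):
            hap1.append(b if flip else a)
            hap2.append(a if flip else b)
        pairs.add(tuple(sorted((tuple(hap1), tuple(hap2)))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, max_iter=200, tol=1e-9):
    """Estimate haplotype frequencies from unphased genotypes by EM."""
    expansions = [diplotypes(g) for g in genotypes]
    haplotypes = {h for pairs in expansions for pair in pairs for h in pair}
    # Start from uniform frequencies over all candidate haplotypes.
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}
    n_chromosomes = 2 * len(genotypes)
    for _ in range(max_iter):
        counts = defaultdict(float)
        for pairs in expansions:
            # E-step: weight each diplotype by its probability under
            # Hardy-Weinberg equilibrium, then normalize per genotype.
            weights = [(1 if h1 == h2 else 2) * freq[h1] * freq[h2]
                       for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype counts divided by chromosome count.
        new_freq = {h: counts[h] / n_chromosomes for h in haplotypes}
        change = max(abs(new_freq[h] - freq[h]) for h in haplotypes)
        freq = new_freq
        if change < tol:
            break
    return freq


if __name__ == "__main__":
    # Toy two-locus sample with made-up allele labels.
    sample = [
        (("A1", "A1"), ("B1", "B1")),
        (("A1", "A2"), ("B1", "B2")),
        (("A2", "A2"), ("B2", "B2")),
    ]
    for hap, f in sorted(em_haplotype_frequencies(sample).items()):
        print(hap, round(f, 4))
```

In this toy sample the double-heterozygous individual is compatible with two phasings; the two homozygous individuals pull the estimate toward the phasing that reuses their haplotypes. The number of candidate diplotypes grows exponentially with the number of heterozygous loci, which is one reason an efficient implementation matters for registry-sized data.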