Haplotype inference based on Hidden Markov Models in the QTL-MAS 2010 multi-generational dataset

Abstract. Background: We have previously demonstrated an approach for efficient computation of genotype probabilities, and more generally probabilities of allele inheritance in inbred as well as outbred populations. That work also included an extension fo...

Bibliographic Details
Main Author: Nettelblad Carl
Format: Article
Language: English
Published: BMC 2011-05-01
Series: BMC Proceedings
id doaj-f548fdd63db7488bb3e7daa4511de275
record_format Article
spelling doaj-f548fdd63db7488bb3e7daa4511de275; 2020-11-24T22:43:28Z; eng; BMC; BMC Proceedings; 1753-6561; 2011-05-01; 5; Suppl 3; S10; 10.1186/1753-6561-5-S3-S10; Haplotype inference based on Hidden Markov Models in the QTL-MAS 2010 multi-generational dataset; Nettelblad Carl
collection DOAJ
language English
format Article
sources DOAJ
author Nettelblad Carl
title Haplotype inference based on Hidden Markov Models in the QTL-MAS 2010 multi-generational dataset
publisher BMC
series BMC Proceedings
issn 1753-6561
publishDate 2011-05-01
description Abstract. Background: We have previously demonstrated an approach for efficient computation of genotype probabilities, and more generally probabilities of allele inheritance in inbred as well as outbred populations. That work also included an extension for haplotype inference, or phasing, using Hidden Markov Models. Computational phasing of datasets with thousands of markers is not yet common. In this communication, we further investigate the previously presented method on such problems, using a multi-generational dataset simulated for QTL detection. Results: When analyzing the dataset simulated for the 14th QTL-MAS workshop, the phasing showed zero deviations from the original simulated phase in the founder generation. In total, 99.93% of all markers were correctly phased, and 97.68% of the individuals were correctly phased at all markers across all 5 simulated chromosomes. Results were produced over a weekend on a small computational cluster. The specific algorithmic adaptations that the Markov model training approach needed in order to reach convergence are described. Conclusions: Our method provides efficient, near-perfect haplotype inference, allowing the determination of completely phased genomes in dense pedigrees. These developments are of special value in complex multi-generational crosses and in applications where marker alleles do not correspond directly to QTL alleles, which makes tracking of allele origin necessary. The cnF2freq codebase, which is under active development, is available under a BSD-style license.