A fast algorithm for genome-wide haplotype pattern mining

Abstract. Background: Identifying the genetic components of common diseases has long been an important area of research. Recently, genotyping technology has reached the level where it is cost effective to genotype single nucleotide polymorphism (SNP) mark...

Bibliographic Details
Main Authors: Pedersen Christian NS, Besenbacher Søren, Mailund Thomas
Format: Article
Language: English
Published: BMC 2009-01-01
Series: BMC Bioinformatics
id doaj-976277da56e24075a896fd4186c78edd
record_format Article
spelling doaj-976277da56e24075a896fd4186c78edd; 2020-11-25T02:46:16Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2009-01-01; 10; Suppl 1; S74; 10.1186/1471-2105-10-S1-S74; A fast algorithm for genome-wide haplotype pattern mining; Pedersen Christian NS; Besenbacher Søren; Mailund Thomas
collection DOAJ
language English
format Article
sources DOAJ
author Pedersen Christian NS
Besenbacher Søren
Mailund Thomas
title A fast algorithm for genome-wide haplotype pattern mining
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2009-01-01
description Abstract. Background: Identifying the genetic components of common diseases has long been an important area of research. Recently, genotyping technology has reached the level where it is cost effective to genotype single nucleotide polymorphism (SNP) markers covering the entire genome in thousands of individuals, and to analyse such data for markers associated with a disease. The statistical power to detect association, however, is limited when markers are analysed one at a time. This can be alleviated by considering multiple markers simultaneously. The Haplotype Pattern Mining (HPM) method is a machine learning approach that does exactly this. Results: We present a new, faster algorithm for the HPM method. The new approach exploits patterns of haplotype diversity in the genome: locally, the number of observed haplotypes is much smaller than the total number of possible haplotypes. We show that the new approach speeds up the HPM method by a factor of 2 on a genome-wide dataset with 5009 individuals typed at 491208 markers using default parameters, and by more if the pattern length is increased. Conclusion: The new algorithm speeds up the HPM method, and we show that it is feasible to apply HPM to whole-genome association mapping with thousands of individuals and hundreds of thousands of markers.