Analyses and comparison of accuracy of different genotype imputation methods.

The power of genetic association analyses is often compromised by missing genotypic data, which contributes to a lack of significant findings, e.g., in in silico replication studies. One solution is to impute untyped SNPs from typed flanking markers, based on known linkage disequilibrium (LD) relations...


Bibliographic Details
Main Authors: Yu-Fang Pei, Jian Li, Lei Zhang, Christopher J Papasian, Hong-Wen Deng
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2008-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC2569208?pdf=render
id doaj-cd52b6139fc14b59a196efe98ed9558e
record_format Article
spelling doaj-cd52b6139fc14b59a196efe98ed9558e; 2020-11-25T01:48:14Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2008-01-01; vol. 3, no. 10, e3551; doi:10.1371/journal.pone.0003551
collection DOAJ
sources DOAJ
issn 1932-6203
description The power of genetic association analyses is often compromised by missing genotypic data, which contributes to a lack of significant findings, e.g., in in silico replication studies. One solution is to impute untyped SNPs from typed flanking markers, based on known linkage disequilibrium (LD) relationships. Several imputation methods are available and their usefulness in association studies has been demonstrated, but the factors affecting their relative accuracy have not been systematically investigated. We therefore compared five popular genotype imputation methods, MACH, IMPUTE, fastPHASE, PLINK and Beagle, to assess the factors that affect imputation accuracy rates (ARs). Our results showed that stronger LD and a lower minor allele frequency (MAF) for an untyped marker produced better ARs for all five methods. We also observed that a greater number of haplotypes in the reference sample resulted in higher ARs for MACH, IMPUTE, PLINK and Beagle, but had little influence on the ARs for fastPHASE. In general, MACH and IMPUTE produced similar results, and these two methods consistently outperformed fastPHASE, PLINK and Beagle. Our study helps guide the application of imputation methods in association analyses when genotype data are missing.