An empirical evaluation of imputation accuracy for association statistics reveals increased type-I error rates in genome-wide associations
Abstract. Background: Genome-wide association studies (GWAS) are becoming the approach of choice to identify genetic determinants of complex phenotypes and common diseases. The astonishing amount of generated data and the use of distinct genotyping platfo...
Main Authors: | Pereira Alexandre C, Krieger José E, Pereira Tiago V, Oliveira Paulo SL, Almeida Marcio AA |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2011-01-01 |
Series: | BMC Genetics |
Online Access: | http://www.biomedcentral.com/1471-2156/12/10 |
id |
doaj-d73a7dcdee454cfcb611782172d4351d |
---|---|
record_format |
Article |
spelling |
doaj-d73a7dcdee454cfcb611782172d4351d 2020-11-25T03:10:54Z eng BMC BMC Genetics 1471-2156 2011-01-01 12 1 10 10.1186/1471-2156-12-10 An empirical evaluation of imputation accuracy for association statistics reveals increased type-I error rates in genome-wide associations Pereira Alexandre C Krieger José E Pereira Tiago V Oliveira Paulo SL Almeida Marcio AA Abstract. Background: Genome-wide association studies (GWAS) are becoming the approach of choice to identify genetic determinants of complex phenotypes and common diseases. The astonishing amount of generated data and the use of distinct genotyping platforms with variable genomic coverage remain analytical challenges. Imputation algorithms combine information from directly genotyped markers with the haplotypic structure of the population of interest to infer poorly genotyped or missing markers, and are considered a near-zero-cost approach to comparing and combining data generated in different studies. Several reports have stated that imputed markers have an overall acceptable accuracy, but no published report has performed a pairwise comparison of imputed and empirical association statistics for a complete set of GWAS markers. Results: In this report we identified a total of 73 imputed markers that yielded a nominally statistically significant association at P < 10⁻⁵ for type 2 diabetes mellitus and compared them with results obtained from empirical allelic frequencies. Despite their overall high correlation, association statistics based on imputed frequencies were discordant in 35 of the 73 (47%) associated markers, considerably inflating the type I error rate of imputed markers. We comprehensively tested several quality thresholds, the haplotypic structure underlying imputed markers, and the use of flanking markers as predictors of inaccurate association statistics derived from imputed markers. Conclusions: Our results suggest that association statistics from imputed markers that fall in a specific minor allele frequency (MAF) range, are located in weak linkage disequilibrium blocks, or deviate strongly from local patterns of association are prone to inflated false-positive association signals. The present study highlights the potential of imputation procedures and proposes simple procedures for selecting the best imputed markers for follow-up genotyping studies. http://www.biomedcentral.com/1471-2156/12/10 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Pereira Alexandre C Krieger José E Pereira Tiago V Oliveira Paulo SL Almeida Marcio AA |
spellingShingle |
Pereira Alexandre C Krieger José E Pereira Tiago V Oliveira Paulo SL Almeida Marcio AA An empirical evaluation of imputation accuracy for association statistics reveals increased type-I error rates in genome-wide associations BMC Genetics |
author_facet |
Pereira Alexandre C Krieger José E Pereira Tiago V Oliveira Paulo SL Almeida Marcio AA |
author_sort |
Pereira Alexandre C |
title |
An empirical evaluation of imputation accuracy for association statistics reveals increased type-I error rates in genome-wide associations |
title_short |
An empirical evaluation of imputation accuracy for association statistics reveals increased type-I error rates in genome-wide associations |
title_full |
An empirical evaluation of imputation accuracy for association statistics reveals increased type-I error rates in genome-wide associations |
title_fullStr |
An empirical evaluation of imputation accuracy for association statistics reveals increased type-I error rates in genome-wide associations |
title_full_unstemmed |
An empirical evaluation of imputation accuracy for association statistics reveals increased type-I error rates in genome-wide associations |
title_sort |
empirical evaluation of imputation accuracy for association statistics reveals increased type-i error rates in genome-wide associations |
publisher |
BMC |
series |
BMC Genetics |
issn |
1471-2156 |
publishDate |
2011-01-01 |
description |
Abstract. Background: Genome-wide association studies (GWAS) are becoming the approach of choice to identify genetic determinants of complex phenotypes and common diseases. The astonishing amount of generated data and the use of distinct genotyping platforms with variable genomic coverage remain analytical challenges. Imputation algorithms combine information from directly genotyped markers with the haplotypic structure of the population of interest to infer poorly genotyped or missing markers, and are considered a near-zero-cost approach to comparing and combining data generated in different studies. Several reports have stated that imputed markers have an overall acceptable accuracy, but no published report has performed a pairwise comparison of imputed and empirical association statistics for a complete set of GWAS markers. Results: In this report we identified a total of 73 imputed markers that yielded a nominally statistically significant association at P < 10⁻⁵ for type 2 diabetes mellitus and compared them with results obtained from empirical allelic frequencies. Despite their overall high correlation, association statistics based on imputed frequencies were discordant in 35 of the 73 (47%) associated markers, considerably inflating the type I error rate of imputed markers. We comprehensively tested several quality thresholds, the haplotypic structure underlying imputed markers, and the use of flanking markers as predictors of inaccurate association statistics derived from imputed markers. Conclusions: Our results suggest that association statistics from imputed markers that fall in a specific minor allele frequency (MAF) range, are located in weak linkage disequilibrium blocks, or deviate strongly from local patterns of association are prone to inflated false-positive association signals. The present study highlights the potential of imputation procedures and proposes simple procedures for selecting the best imputed markers for follow-up genotyping studies. |
url |
http://www.biomedcentral.com/1471-2156/12/10 |
work_keys_str_mv |
AT pereiraalexandrec anempiricalevaluationofimputationaccuracyforassociationstatisticsrevealsincreasedtypeierrorratesingenomewideassociations AT kriegerjosee anempiricalevaluationofimputationaccuracyforassociationstatisticsrevealsincreasedtypeierrorratesingenomewideassociations AT pereiratiagov anempiricalevaluationofimputationaccuracyforassociationstatisticsrevealsincreasedtypeierrorratesingenomewideassociations AT oliveirapaulosl anempiricalevaluationofimputationaccuracyforassociationstatisticsrevealsincreasedtypeierrorratesingenomewideassociations AT almeidamarcioaa anempiricalevaluationofimputationaccuracyforassociationstatisticsrevealsincreasedtypeierrorratesingenomewideassociations AT pereiraalexandrec empiricalevaluationofimputationaccuracyforassociationstatisticsrevealsincreasedtypeierrorratesingenomewideassociations AT kriegerjosee empiricalevaluationofimputationaccuracyforassociationstatisticsrevealsincreasedtypeierrorratesingenomewideassociations AT pereiratiagov empiricalevaluationofimputationaccuracyforassociationstatisticsrevealsincreasedtypeierrorratesingenomewideassociations AT oliveirapaulosl empiricalevaluationofimputationaccuracyforassociationstatisticsrevealsincreasedtypeierrorratesingenomewideassociations AT almeidamarcioaa empiricalevaluationofimputationaccuracyforassociationstatisticsrevealsincreasedtypeierrorratesingenomewideassociations |
_version_ |
1724656599789731840 |
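
The comparison described in the abstract (an association statistic computed from imputed allele frequencies versus the same statistic computed from directly genotyped ones, judged against the P < 10⁻⁵ threshold) can be illustrated with a minimal sketch. The record does not say which test or discordance rule the authors used, so everything below is an assumption: the 1-df Pearson chi-square allelic test, the function names `allelic_chi2_p` and `is_discordant`, and the rule that a marker is "discordant" when only the imputed statistic passes the threshold are all hypothetical choices made for illustration.

```python
import math

THRESHOLD = 1e-5  # nominal significance level quoted in the abstract


def allelic_chi2_p(case_minor, case_major, ctrl_minor, ctrl_major):
    """P-value of a 1-df Pearson chi-square test on a 2x2 allele-count table.

    Counts may be fractional, e.g. expected allele counts from imputation.
    ASSUMPTION: the abstract does not name the test; this is one common choice.
    """
    table = [[case_minor, case_major], [ctrl_minor, ctrl_major]]
    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_sums)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            if expected == 0:
                # Monomorphic marker or empty group: the test is undefined.
                raise ValueError("expected count is zero; test undefined")
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def is_discordant(p_imputed, p_empirical, threshold=THRESHOLD):
    """Hypothetical rule: the imputed statistic passes the threshold
    but the statistic from directly genotyped data does not."""
    return p_imputed < threshold and p_empirical >= threshold


# Illustrative, made-up allele counts (not data from the study).
p_imp = allelic_chi2_p(300, 700, 200, 800)  # imputed frequencies
p_emp = allelic_chi2_p(300, 700, 250, 750)  # genotyped frequencies
print(p_imp, p_emp, is_discordant(p_imp, p_emp))
```

Under this rule, a marker flagged by `is_discordant` would count as a false positive attributable to imputation, which is the kind of event the abstract reports for 35 of the 73 markers.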