Multiple imputation using linked proxy outcome data resulted in important bias reduction and efficiency gains: a simulation study
Abstract. Background: When an outcome variable is missing not at random (MNAR: probability of missingness depends on outcome values), estimates of the effect of an exposure on this outcome are often biased. We investigated the extent of this bias and examined whether the bias can be reduced through in...
Main Authors: | R. P. Cornish, J. Macleod, J. R. Carpenter, K. Tilling |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2017-12-01 |
Series: | Emerging Themes in Epidemiology |
Subjects: | Missing data, Multiple imputation, Bias, Simulation study, ALSPAC, Data linkage |
Online Access: | http://link.springer.com/article/10.1186/s12982-017-0068-0 |
id |
doaj-e9f0ee71f8254d0996778240ad915652 |
---|---|
record_format |
Article |
spelling |
doaj-e9f0ee71f8254d0996778240ad915652; 2020-11-24T21:49:14Z; eng; BMC; Emerging Themes in Epidemiology; 1742-7622; 2017-12-01; 141113; 10.1186/s12982-017-0068-0; Multiple imputation using linked proxy outcome data resulted in important bias reduction and efficiency gains: a simulation study; R. P. Cornish (0), J. Macleod (1), J. R. Carpenter (2), K. Tilling (3); (0) Population Health Sciences, Bristol Medical School, University of Bristol; (1) Population Health Sciences, Bristol Medical School, University of Bristol; (2) Department of Medical Statistics, Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine; (3) Population Health Sciences, Bristol Medical School, University of Bristol; Abstract. Background: When an outcome variable is missing not at random (MNAR: probability of missingness depends on outcome values), estimates of the effect of an exposure on this outcome are often biased. We investigated the extent of this bias and examined whether the bias can be reduced through incorporating proxy outcomes obtained through linkage to administrative data as auxiliary variables in multiple imputation (MI). Methods: Using data from the Avon Longitudinal Study of Parents and Children (ALSPAC) we estimated the association between breastfeeding and IQ (continuous outcome), incorporating linked attainment data (proxies for IQ) as auxiliary variables in MI models. Simulation studies explored the impact of varying the proportion of missing data (from 20 to 80%), the correlation between the outcome and its proxy (0.1–0.9), the strength of the missing data mechanism, and having a proxy variable that was incomplete. Results: Incorporating a linked proxy for the missing outcome as an auxiliary variable reduced bias and increased efficiency in all scenarios, even when 80% of the outcome was missing. Using an incomplete proxy was similarly beneficial. High correlations (> 0.5) between the outcome and its proxy substantially reduced the missing information. Consistent with this, ALSPAC analysis showed inclusion of a proxy reduced bias and improved efficiency. Gains with additional proxies were modest. Conclusions: In longitudinal studies with loss to follow-up, incorporating proxies for this study outcome obtained via linkage to external sources of data as auxiliary variables in MI models can give practically important bias reduction and efficiency gains when the study outcome is MNAR.; http://link.springer.com/article/10.1186/s12982-017-0068-0; Missing data, Multiple imputation, Bias, Simulation study, ALSPAC, Data linkage |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
R. P. Cornish, J. Macleod, J. R. Carpenter, K. Tilling |
title |
Multiple imputation using linked proxy outcome data resulted in important bias reduction and efficiency gains: a simulation study |
publisher |
BMC |
series |
Emerging Themes in Epidemiology |
issn |
1742-7622 |
publishDate |
2017-12-01 |
description |
Abstract. Background: When an outcome variable is missing not at random (MNAR: probability of missingness depends on outcome values), estimates of the effect of an exposure on this outcome are often biased. We investigated the extent of this bias and examined whether the bias can be reduced through incorporating proxy outcomes obtained through linkage to administrative data as auxiliary variables in multiple imputation (MI). Methods: Using data from the Avon Longitudinal Study of Parents and Children (ALSPAC) we estimated the association between breastfeeding and IQ (continuous outcome), incorporating linked attainment data (proxies for IQ) as auxiliary variables in MI models. Simulation studies explored the impact of varying the proportion of missing data (from 20 to 80%), the correlation between the outcome and its proxy (0.1–0.9), the strength of the missing data mechanism, and having a proxy variable that was incomplete. Results: Incorporating a linked proxy for the missing outcome as an auxiliary variable reduced bias and increased efficiency in all scenarios, even when 80% of the outcome was missing. Using an incomplete proxy was similarly beneficial. High correlations (> 0.5) between the outcome and its proxy substantially reduced the missing information. Consistent with this, ALSPAC analysis showed inclusion of a proxy reduced bias and improved efficiency. Gains with additional proxies were modest. Conclusions: In longitudinal studies with loss to follow-up, incorporating proxies for this study outcome obtained via linkage to external sources of data as auxiliary variables in MI models can give practically important bias reduction and efficiency gains when the study outcome is MNAR. |
topic |
Missing data, Multiple imputation, Bias, Simulation study, ALSPAC, Data linkage |
url |
http://link.springer.com/article/10.1186/s12982-017-0068-0 |
_version_ |
1725888570348011520 |
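
The simulation design summarised in the abstract (an exposure, a continuous outcome that is MNAR, and a linked proxy used as an auxiliary variable in MI) can be illustrated with a minimal Python sketch. This is not the authors' code. The sample size, the true effect (0.5), the logistic missingness slope (2.0), the single regression-based imputation model, and all function names are assumptions chosen for illustration only.

```python
import numpy as np


def ols(design, y):
    """Least-squares fit: coefficients, (X'X)^-1 and residual sum of squares."""
    xtx_inv = np.linalg.inv(design.T @ design)
    beta = xtx_inv @ design.T @ y
    resid = y - design @ beta
    return beta, xtx_inv, float(resid @ resid)


def exposure_effect(x, y):
    """Estimate and sampling variance of the exposure coefficient in y ~ 1 + x."""
    design = np.column_stack([np.ones_like(x), x])
    beta, xtx_inv, rss = ols(design, y)
    sigma2 = rss / (len(y) - design.shape[1])
    return beta[1], sigma2 * xtx_inv[1, 1]


def mi_with_proxy(x, y, proxy, observed, m, rng):
    """Multiple imputation of y from x and the proxy, pooled with Rubin's rules."""
    design = np.column_stack([np.ones_like(x), x, proxy])
    # The imputation model is fitted on cases with an observed outcome only.
    beta_hat, xtx_inv, rss = ols(design[observed], y[observed])
    dof = int(observed.sum()) - design.shape[1]
    n_missing = int((~observed).sum())
    estimates, variances = [], []
    for _ in range(m):
        # Draw imputation-model parameters from their approximate posterior.
        sigma2 = rss / rng.chisquare(dof)
        beta = rng.multivariate_normal(beta_hat, sigma2 * xtx_inv)
        completed = y.copy()
        # Overwrite the (simulated, normally unseen) outcomes of missing cases.
        completed[~observed] = design[~observed] @ beta + rng.normal(
            scale=np.sqrt(sigma2), size=n_missing
        )
        est, var = exposure_effect(x, completed)
        estimates.append(est)
        variances.append(var)
    within = np.mean(variances)
    between = np.var(estimates, ddof=1)
    total = within + (1 + 1 / m) * between  # Rubin's rules total variance
    return float(np.mean(estimates)), float(total)


def run_demo(n=20000, rho=0.9, true_effect=0.5, m=20, seed=0):
    """Simulate one dataset and compare complete-case analysis with MI."""
    rng = np.random.default_rng(seed)
    x = rng.binomial(1, 0.5, n).astype(float)  # binary exposure
    y = true_effect * x + rng.normal(size=n)   # continuous outcome
    # Proxy whose correlation with the outcome is roughly rho (assumed form).
    proxy = rho * y + np.sqrt(1 - rho**2) * rng.normal(size=n)
    # MNAR: lower outcome values are more likely to be missing.
    observed = rng.random(n) < 1 / (1 + np.exp(-2.0 * y))
    cc_est, cc_var = exposure_effect(x[observed], y[observed])
    mi_est, mi_var = mi_with_proxy(x, y, proxy, observed, m, rng)
    return {
        "true_effect": true_effect,
        "missing_fraction": float(1 - observed.mean()),
        "cc_estimate": float(cc_est),
        "cc_se": float(np.sqrt(cc_var)),
        "mi_estimate": mi_est,
        "mi_se": float(np.sqrt(mi_var)),
    }


if __name__ == "__main__":
    for key, value in run_demo().items():
        print(f"{key}: {value:.3f}")
```

Rerunning `run_demo` with smaller values of `rho` gives a rough sense of how the benefit of the proxy depends on its correlation with the outcome. Because the outcome remains MNAR given the exposure and proxy, some bias is expected to persist in this sketch even after imputation.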