Auxiliary variables in multiple imputation in regression with missing X: a warning against including too many in small sample research
Abstract. Background: Multiple imputation is becoming increasingly popular. Theoretical considerations as well as simulation studies have shown that the inclusion of auxiliary variables is generally of benefit. Methods: ...
Main Authors: | Hardt Jochen, Herke Max, Leonhart Rainer |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2012-12-01 |
Series: | BMC Medical Research Methodology |
Subjects: | Multiple imputation; Auxiliary variables; Simulation study; Small and medium size samples |
Online Access: | http://www.biomedcentral.com/1471-2288/12/184 |
id |
doaj-330c5e6b88b442af80a939e42b4ce5ca |
---|---|
record_format |
Article |
spelling |
doaj-330c5e6b88b442af80a939e42b4ce5ca 2020-11-25T00:09:56Z eng BMC BMC Medical Research Methodology 1471-2288 2012-12-01 12 1 184 10.1186/1471-2288-12-184 Auxiliary variables in multiple imputation in regression with missing X: a warning against including too many in small sample research Hardt Jochen Herke Max Leonhart Rainer Abstract. Background: Multiple imputation is becoming increasingly popular. Theoretical considerations as well as simulation studies have shown that the inclusion of auxiliary variables is generally of benefit. Methods: A simulation study of a linear regression with a response Y and two predictors X1 and X2 was performed on data with n = 50, 100 and 200 using complete cases or multiple imputation with 0, 10, 20, 40 and 80 auxiliary variables. Mechanisms of missingness were either 100% MCAR or 50% MAR + 50% MCAR. Auxiliary variables had low (r = .10) vs. moderate correlations (r = .50) with X’s and Y. Results: The inclusion of auxiliary variables can improve a multiple imputation model. However, inclusion of too many variables leads to downward bias of regression coefficients and decreases precision. When the correlations are low, inclusion of auxiliary variables is not useful. Conclusion: More research on auxiliary variables in multiple imputation should be performed. A preliminary rule of thumb could be that the ratio of variables to cases with complete data should not go below 1 : 3. http://www.biomedcentral.com/1471-2288/12/184 Multiple imputation Auxiliary variables Simulation study Small and medium size samples |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Hardt Jochen Herke Max Leonhart Rainer |
title |
Auxiliary variables in multiple imputation in regression with missing X: a warning against including too many in small sample research |
publisher |
BMC |
series |
BMC Medical Research Methodology |
issn |
1471-2288 |
publishDate |
2012-12-01 |
description |
Abstract. Background: Multiple imputation is becoming increasingly popular. Theoretical considerations as well as simulation studies have shown that the inclusion of auxiliary variables is generally of benefit. Methods: A simulation study of a linear regression with a response Y and two predictors X1 and X2 was performed on data with n = 50, 100 and 200 using complete cases or multiple imputation with 0, 10, 20, 40 and 80 auxiliary variables. Mechanisms of missingness were either 100% MCAR or 50% MAR + 50% MCAR. Auxiliary variables had low (r = .10) vs. moderate correlations (r = .50) with X’s and Y. Results: The inclusion of auxiliary variables can improve a multiple imputation model. However, inclusion of too many variables leads to downward bias of regression coefficients and decreases precision. When the correlations are low, inclusion of auxiliary variables is not useful. Conclusion: More research on auxiliary variables in multiple imputation should be performed. A preliminary rule of thumb could be that the ratio of variables to cases with complete data should not go below 1 : 3. |
topic |
Multiple imputation Auxiliary variables Simulation study Small and medium size samples |
url |
http://www.biomedcentral.com/1471-2288/12/184 |
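The rule of thumb in the abstract's conclusion can be turned into a quick check before an imputation model is specified. The sketch below is illustrative only and is not taken from the article. It assumes the rule is read as "at least three complete cases per variable in the imputation model", and the function names and the example figures (60 complete cases, 3 analysis variables plus 10 or 20 auxiliary variables) are hypothetical.

```python
def max_imputation_variables(n_complete_cases: int, cases_per_variable: int = 3) -> int:
    """Largest variable count allowed under the assumed reading of the rule of thumb."""
    if n_complete_cases < 0 or cases_per_variable <= 0:
        raise ValueError("need n_complete_cases >= 0 and cases_per_variable > 0")
    return n_complete_cases // cases_per_variable


def within_rule_of_thumb(n_variables: int, n_complete_cases: int,
                         cases_per_variable: int = 3) -> bool:
    """True if there are at least `cases_per_variable` complete cases per variable."""
    if n_variables < 0:
        raise ValueError("n_variables must be non-negative")
    return n_variables <= max_imputation_variables(n_complete_cases, cases_per_variable)


if __name__ == "__main__":
    n_complete = 60          # hypothetical number of cases with complete data
    analysis_vars = 3        # Y, X1 and X2, as in the simulated regression
    for n_aux in (10, 20):   # hypothetical choices of auxiliary variables
        total = analysis_vars + n_aux
        print(n_aux, "auxiliary variables:", within_rule_of_thumb(total, n_complete))
```

With these hypothetical numbers, 13 variables fit within the limit of 20, while 23 variables exceed it. Whether the analysis variables should be counted alongside the auxiliary variables is itself an assumption of this sketch.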