Practical strategies for handling breakdown of multiple imputation procedures

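Abstract Multiple imputation is a recommended method for handling incomplete data problems. One of the barriers to its successful use is the breakdown of the multiple imputation procedure, often due to numerical problems with the algorithms used within the imputation process. These problems frequently occur when imputation models contain large numbers of variables, especially with the popular approach of multivariate imputation by chained equations. This paper describes common causes of failure of the imputation procedure including perfect prediction and collinearity, focusing on issues when using Stata software. We outline a number of strategies for addressing these issues, including imputation of composite variables instead of individual components, introducing prior information and changing the form of the imputation model. These strategies are illustrated using a case study based on data from the Longitudinal Study of Australian Children.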


Bibliographic Details
Main Authors: Cattram D. Nguyen, John B. Carlin, Katherine J. Lee
Format: Article
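Language: English
Published: BMC 2021-04-01
Series: Emerging Themes in Epidemiology
Subjects: Auxiliary variables; Collinearity; Convergence; Missing data; Multiple imputation; Multivariate imputation by chained equations
Online Access: https://doi.org/10.1186/s12982-021-00095-3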
id doaj-063ca46f4da245e0bdce6c6992282d61
record_format Article
spelling doaj-063ca46f4da245e0bdce6c6992282d61
2021-04-04T11:03:48Z
eng
BMC
Emerging Themes in Epidemiology
1742-7622
2021-04-01
18118
10.1186/s12982-021-00095-3
Practical strategies for handling breakdown of multiple imputation procedures
Cattram D. Nguyen (0)
John B. Carlin (1)
Katherine J. Lee (2)
(0) Clinical Epidemiology and Biostatistics Unit, Murdoch Children’s Research Institute, The Royal Children’s Hospital
(1) Clinical Epidemiology and Biostatistics Unit, Murdoch Children’s Research Institute, The Royal Children’s Hospital
(2) Clinical Epidemiology and Biostatistics Unit, Murdoch Children’s Research Institute, The Royal Children’s Hospital
Abstract Multiple imputation is a recommended method for handling incomplete data problems. One of the barriers to its successful use is the breakdown of the multiple imputation procedure, often due to numerical problems with the algorithms used within the imputation process. These problems frequently occur when imputation models contain large numbers of variables, especially with the popular approach of multivariate imputation by chained equations. This paper describes common causes of failure of the imputation procedure including perfect prediction and collinearity, focusing on issues when using Stata software. We outline a number of strategies for addressing these issues, including imputation of composite variables instead of individual components, introducing prior information and changing the form of the imputation model. These strategies are illustrated using a case study based on data from the Longitudinal Study of Australian Children.
https://doi.org/10.1186/s12982-021-00095-3
Auxiliary variables
Collinearity
Convergence
Missing data
Multiple imputation
Multivariate imputation by chained equations
collection DOAJ
language English
format Article
sources DOAJ
author Cattram D. Nguyen
John B. Carlin
Katherine J. Lee
author_sort Cattram D. Nguyen
title Practical strategies for handling breakdown of multiple imputation procedures
publisher BMC
series Emerging Themes in Epidemiology
issn 1742-7622
publishDate 2021-04-01
description Abstract Multiple imputation is a recommended method for handling incomplete data problems. One of the barriers to its successful use is the breakdown of the multiple imputation procedure, often due to numerical problems with the algorithms used within the imputation process. These problems frequently occur when imputation models contain large numbers of variables, especially with the popular approach of multivariate imputation by chained equations. This paper describes common causes of failure of the imputation procedure including perfect prediction and collinearity, focusing on issues when using Stata software. We outline a number of strategies for addressing these issues, including imputation of composite variables instead of individual components, introducing prior information and changing the form of the imputation model. These strategies are illustrated using a case study based on data from the Longitudinal Study of Australian Children.
topic Auxiliary variables
Collinearity
Convergence
Missing data
Multiple imputation
Multivariate imputation by chained equations
url https://doi.org/10.1186/s12982-021-00095-3