Heckman imputation models for binary or continuous MNAR outcomes and MAR predictors

Abstract. Background: Multiple imputation by chained equations (MICE) requires specifying a suitable conditional imputation model for each incomplete variable and then iteratively imputes the missing values. In the presence of missing not at random (MNAR) outcomes, valid statistical inference often re...

Bibliographic Details
Main Authors: Jacques-Emmanuel Galimard, Sylvie Chevret, Emmanuel Curis, Matthieu Resche-Rigon
Format: Article
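Language: English
Published: BMC 2018-08-01
Series: BMC Medical Research Methodology
Subjects: Heckman’s model; Missing data; Missing not at random (MNAR); Multiple imputation by chained equation (MICE); Sample selection method
Online Access: http://link.springer.com/article/10.1186/s12874-018-0547-1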
id doaj-e76ad7cce14c4ed4b8ab3e320d09a3b4
record_format Article
spelling doaj-e76ad7cce14c4ed4b8ab3e320d09a3b4
2020-11-25T01:19:20Z
eng
BMC
BMC Medical Research Methodology
1471-2288
2018-08-01
Vol 18, Iss 1, pp 1-13
10.1186/s12874-018-0547-1
Heckman imputation models for binary or continuous MNAR outcomes and MAR predictors
Jacques-Emmanuel Galimard (INSERM U1153, Epidemiology and Biostatistics Sorbonne Paris Cité Research Center (CRESS), ECSTRA team)
Sylvie Chevret (INSERM U1153, Epidemiology and Biostatistics Sorbonne Paris Cité Research Center (CRESS), ECSTRA team)
Emmanuel Curis (INSERM UMR-S 1144, Équipe 1, Université Paris Descartes, Université Paris Diderot, Sorbonne Paris Cité)
Matthieu Resche-Rigon (INSERM U1153, Epidemiology and Biostatistics Sorbonne Paris Cité Research Center (CRESS), ECSTRA team)
Abstract
Background: Multiple imputation by chained equations (MICE) requires specifying a suitable conditional imputation model for each incomplete variable and then iteratively imputes the missing values. In the presence of missing not at random (MNAR) outcomes, valid statistical inference often requires joint models for missing observations and their indicators of missingness. In this study, we derived an imputation model for missing binary data with MNAR mechanism from Heckman’s model using a one-step maximum likelihood estimator. We applied this approach to improve a previously developed approach for MNAR continuous outcomes using Heckman’s model and a two-step estimator. These models allow us to use a MICE process and can thus also handle missing at random (MAR) predictors in the same MICE process.
Methods: We simulated 1000 datasets of 500 cases. We generated the following missing data mechanisms on 30% of the outcomes: MAR mechanism, weak MNAR mechanism, and strong MNAR mechanism. We then resimulated the first three cases and added an additional 30% of MAR data on a predictor, resulting in 50% of complete cases. We evaluated and compared the performance of the developed approach to that of a complete case approach and classical Heckman’s model estimates.
Results: With MNAR outcomes, only methods using Heckman’s model were unbiased, and with a MAR predictor, the developed imputation approach outperformed all the other approaches.
Conclusions: In the presence of MAR predictors, we proposed a simple approach to address MNAR binary or continuous outcomes under a Heckman assumption in a MICE procedure.
http://link.springer.com/article/10.1186/s12874-018-0547-1
Heckman’s model; Missing data; Missing not at random (MNAR); Multiple imputation by chained equation (MICE); Sample selection method
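The simulation design summarized in the Methods of the abstract (repeated datasets of 500 cases, about 30% of outcomes missing, under MAR, weak MNAR, and strong MNAR mechanisms) can be illustrated with a minimal Python sketch. It is not the authors' code and does not implement their imputation model; it only shows why a complete case analysis becomes biased when a continuous outcome is MNAR under a Heckman-type selection model. All coefficients, the error correlations (0, 0.3, 0.6), the selection-only covariate `z`, and the function names are assumptions made for illustration, and the number of replicates is reduced from 1000 to 200 to keep the run short.

```python
import math
import random
from statistics import NormalDist, fmean

# Illustrative values only; none of these come from the paper.
BETA0, BETA1 = 1.0, 0.5      # outcome model: y = BETA0 + BETA1 * x + e_y
ALPHA1, ALPHA2 = 0.5, 1.0    # selection-model slopes for x and z
P_OBSERVED = 0.70            # about 30% of outcomes missing, as in the abstract

# Selection intercept chosen so that P(outcome observed) = P_OBSERVED.
SD_SEL = math.sqrt(ALPHA1**2 + ALPHA2**2 + 1.0)
ALPHA0 = NormalDist().inv_cdf(P_OBSERVED) * SD_SEL


def simulate_dataset(n, rho, rng):
    """Return (x, y, observed) rows drawn from a Heckman-type selection model."""
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        z = rng.gauss(0.0, 1.0)  # hypothetical covariate acting on selection only
        e_s = rng.gauss(0.0, 1.0)
        # corr(e_y, e_s) = rho: rho = 0 gives MAR, rho != 0 gives MNAR.
        e_y = rho * e_s + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
        y = BETA0 + BETA1 * x + e_y
        observed = ALPHA0 + ALPHA1 * x + ALPHA2 * z + e_s > 0.0
        rows.append((x, y, observed))
    return rows


def complete_case_intercept(rows):
    """OLS intercept of y on x using only rows whose outcome is observed."""
    xs = [x for x, _, obs in rows if obs]
    ys = [y for _, y, obs in rows if obs]
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return y_bar - (sxy / sxx) * x_bar


def intercept_bias(rho, n_sim=200, n=500, seed=2018):
    """Average complete-case intercept over n_sim datasets, minus the true value."""
    rng = random.Random(seed)
    estimates = [
        complete_case_intercept(simulate_dataset(n, rho, rng))
        for _ in range(n_sim)
    ]
    return fmean(estimates) - BETA0


if __name__ == "__main__":
    for label, rho in [("MAR", 0.0), ("weak MNAR", 0.3), ("strong MNAR", 0.6)]:
        print(f"{label:12s} rho={rho:.1f}  bias={intercept_bias(rho):+.3f}")
```

Under these assumptions the complete-case intercept should be close to unbiased when the errors are uncorrelated and should drift upward as the correlation grows, which is the pattern the abstract reports for methods that ignore the selection mechanism.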
collection DOAJ
language English
format Article
sources DOAJ
author Jacques-Emmanuel Galimard
Sylvie Chevret
Emmanuel Curis
Matthieu Resche-Rigon
title Heckman imputation models for binary or continuous MNAR outcomes and MAR predictors
publisher BMC
series BMC Medical Research Methodology
issn 1471-2288
publishDate 2018-08-01
topic Heckman’s model
Missing data
Missing not at random (MNAR)
Multiple imputation by chained equation (MICE)
Sample selection method
url http://link.springer.com/article/10.1186/s12874-018-0547-1