Summary: | Observational studies predicated on the secondary use of information from
administrative and health databases often encounter the problem of missing and mismeasured data. Although there is much methodological literature pertaining to each problem in isolation, there is a scant body of literature addressing both problems in tandem. I investigate the effect of missing
and mismeasured covariates on parameter estimation from a binary logistic
regression model and propose a likelihood-based method to adjust for the combined data deficiencies. Two simulation studies are used to understand the effect of data imperfection on parameter estimation and to evaluate the utility of a likelihood-based adjustment. When missing and mismeasured data occurred for separate covariates, I
found that the parameter estimate for the mismeasured covariate was biased and that the parameter estimate for the covariate with missing data may be biased under both missing at random and non-ignorable missingness assumptions. A Monte Carlo Expectation-Maximization adjustment
reduced the magnitude of the bias, but a trade-off was observed: bias reduction for the mismeasured covariate came at the cost of increased bias in the estimates for the other covariates. When both problems affected a single covariate, the parameter estimates for the imperfect covariate and for the other covariates were biased. The Monte
Carlo Expectation-Maximization adjustment often corrected the bias, but the bias trade-off amongst the covariates was again observed. For both simulation studies, I observed that the results potentially differed across missing data mechanisms. A substantive data set was investigated and, by using the structurally similar second simulation study, I could provide reasonable
conclusions about the nature of the estimates. I could also suggest avenues of research that would potentially minimize expenditures for additional high-quality data. I conclude that the problem of imperfection may be addressed through
standard statistical methodology, but that the known effects of missing data
or measurement error may not manifest as expected when more general data
imperfections are considered. === Science, Faculty of === Statistics, Department of === Graduate
|