Imperfect variables : the combined problem of missing data and mismeasured variables with application to generalized linear models
Observational studies predicated on the secondary use of information from administrative and health databases often encounter the problem of missing and mismeasured data. Although there is much methodological literature pertaining to each problem in isolation, there is a scant body of literature add...
Main Author: | Regier, Michael David |
---|---|
Format: | Others |
Language: | English |
Published: | University of British Columbia 2009 |
Online Access: | http://hdl.handle.net/2429/15883 |
id |
ndltd-UBC-oai-circle.library.ubc.ca-2429-15883 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UBC-oai-circle.library.ubc.ca-2429-15883 2018-01-05T17:23:56Z Imperfect variables : the combined problem of missing data and mismeasured variables with application to generalized linear models Regier, Michael David Science, Faculty of Statistics, Department of Graduate 2009-11-27T19:34:21Z 2009-11-27T19:34:21Z 2009 2009-11 Text Thesis/Dissertation http://hdl.handle.net/2429/15883 eng Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/ 4407561 bytes application/pdf University of British Columbia |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
description |
Observational studies predicated on the secondary use of information from administrative and health databases often encounter the problem of missing and mismeasured data. Although there is much methodological literature pertaining to each problem in isolation, there is a scant body of literature addressing both problems in tandem. I investigate the effect of missing and mismeasured covariates on parameter estimation from a binary logistic regression model and propose a likelihood based method to adjust for the combined data deficiencies. Two simulation studies are used to understand the effect of data imperfection on parameter estimation and to evaluate the utility of a likelihood based adjustment. When missing and mismeasured data occurred for separate covariates, I found that the parameter estimate associated with the mismeasured portion was biased and that the parameter estimate for the missing data aspect may be biased under both missing at random and non-ignorable missing at random assumptions. A Monte Carlo Expectation-Maximization adjustment reduced the magnitude of the bias, but a trade-off was observed. Bias reduction for the mismeasured covariate was achieved by increasing the bias associated with the others. When both problems affected a single covariate, the parameter estimate for the imperfect covariate was biased. Additionally, the parameter estimates for the other covariates were also biased. The Monte Carlo Expectation-Maximization adjustment often corrected the bias, but the bias trade-off amongst the covariates was observed. For both simulation studies, I observed a potential dissimilarity across missing data mechanisms.
A substantive data set was investigated and by using the second simulation study, which was structurally similar, I could provide reasonable conclusions about the nature of the estimates. Also, I could suggest avenues of research which would potentially minimize expenditures for additional high quality data. I conclude that the problem of imperfection may be addressed through standard statistical methodology, but that the known effects of missing data or measurement error may not manifest as expected when more general data imperfections are considered. === Science, Faculty of === Statistics, Department of === Graduate |
author |
Regier, Michael David |
title |
Imperfect variables : the combined problem of missing data and mismeasured variables with application to generalized linear models |
publisher |
University of British Columbia |
publishDate |
2009 |
url |
http://hdl.handle.net/2429/15883 |