Analysing complex linked administrative data in health services research: Issues and solutions
Introduction: Linked administrative data are increasingly being used to evaluate the impact of health policy on health-service use/cost because they can comprehensively capture whole-of-population interactions with the health system. These analyses are complex, comprising unbalanced panels, and are at...
Main Authors: | Rachael Rachael, David David, Mark Harris |
---|---|
Format: | Article |
Language: | English |
Published: | Swansea University, 2018-08-01 |
Series: | International Journal of Population Data Science |
Online Access: | https://ijpds.org/article/view/641 |
id |
doaj-80b87b635d934855a27b4b2d105b4a1a |
---|---|
record_format |
Article |
spelling |
doaj-80b87b635d934855a27b4b2d105b4a1a; 2020-11-25T02:09:25Z; eng; Swansea University; International Journal of Population Data Science; 2399-4908; 2018-08-01; volume 3, issue 4; 10.23889/ijpds.v3i4.641; Analysing complex linked administrative data in health services research: Issues and solutions; Rachael Rachael, David David, Mark Harris (all Curtin University)
Introduction
Linked administrative data are increasingly being used to evaluate the impact of health policy on health-service use/cost because they can comprehensively capture whole-of-population interactions with the health system. These analyses are complex, comprising unbalanced panels, and are at risk of endogeneity and associated problems.
Objectives and Approach
We evaluated the impact of changes in the regularity of general practitioner (GP) contact on diabetes-related hospitalisation before and after care coordination policies, using whole-of-population, person-level linked primary care, hospital, Electoral Roll and death records. Complex panel random-effects modelling techniques were required because of the unbalanced structure of the data (individuals could exit and re-enter the study repeatedly), over-dispersion and a high proportion of zeros, changes in the availability of tests (ascertainment bias), the likelihood of prior health service use influencing the dependent variable (initial conditions and simultaneity/reverse causality bias), and the likely correlation of observed and unobserved variables.
Results
Multivariable zero-inflated negative binomial and Cragg hurdle regressions with cluster-robust standard errors, which include separate components to model zero and non-zero outcomes, were required for these data. Mundlak variables (group means of time-varying variables) were used to relax the random-effects assumption that the observed variables are uncorrelated with the unobserved ones. Prior health service use was adjusted for using 4-year lags of GP contact and a 1-year lag of hospitalisation. Including the initial value of the dependent variable addressed the “initial condition” problem. Ascertainment bias was addressed by including, as a covariate, the number of years available for identification for each person. AIC/BIC values were used to identify the best model. We found that more regular GP contact was associated with fewer hospitalisations; however, this association attenuated over time.
Conclusion/Implications
The availability of linked data, together with increases in computing power, has vastly increased their potential for use. It has also increased the complexity of the analyses being undertaken, making it necessary to recognise and address problems, such as endogeneity, that arise from the observational nature of these studies.
https://ijpds.org/article/view/641 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Rachael Rachael David David Mark Harris |
author_sort |
Rachael Rachael |
title |
Analysing complex linked administrative data in health services research: Issues and solutions |
title_sort |
analysing complex linked administrative data in health services research: issues and solutions |
publisher |
Swansea University |
series |
International Journal of Population Data Science |
issn |
2399-4908 |
publishDate |
2018-08-01 |
description |
Introduction
Linked administrative data are increasingly being used to evaluate the impact of health policy on health-service use/cost because they can comprehensively capture whole-of-population interactions with the health system. These analyses are complex, comprising unbalanced panels, and are at risk of endogeneity and associated problems.
Objectives and Approach
We evaluated the impact of changes in the regularity of general practitioner (GP) contact on diabetes-related hospitalisation before and after care coordination policies, using whole-of-population, person-level linked primary care, hospital, Electoral Roll and death records. Complex panel random-effects modelling techniques were required because of the unbalanced structure of the data (individuals could exit and re-enter the study repeatedly), over-dispersion and a high proportion of zeros, changes in the availability of tests (ascertainment bias), the likelihood of prior health service use influencing the dependent variable (initial conditions and simultaneity/reverse causality bias), and the likely correlation of observed and unobserved variables.
Results
Multivariable zero-inflated negative binomial and Cragg hurdle regressions with cluster-robust standard errors, which include separate components to model zero and non-zero outcomes, were required for these data. Mundlak variables (group means of time-varying variables) were used to relax the random-effects assumption that the observed variables are uncorrelated with the unobserved ones. Prior health service use was adjusted for using 4-year lags of GP contact and a 1-year lag of hospitalisation. Including the initial value of the dependent variable addressed the “initial condition” problem. Ascertainment bias was addressed by including, as a covariate, the number of years available for identification for each person. AIC/BIC values were used to identify the best model. We found that more regular GP contact was associated with fewer hospitalisations; however, this association attenuated over time.
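As an illustration only, the covariate construction described in the Results could be sketched as follows. This is a minimal sketch, not the authors' code: the toy records, the field names (`gp`, `hosp`, `gp_mean`, `hosp_lag1`, `hosp_initial`) and the helper `add_panel_covariates` are all hypothetical. It shows a Mundlak variable (the person-level mean of a time-varying covariate), a one-year lag computed by calendar year so that gaps in an unbalanced panel are not silently bridged, and the initial value of the outcome.

```python
from collections import defaultdict

# Hypothetical person-year records: (person_id, year, gp_regularity, hospitalisations).
# Person 2 has no record for 2003, so the panel is unbalanced.
records = [
    (1, 2001, 0.2, 1),
    (1, 2002, 0.4, 0),
    (1, 2003, 0.6, 0),
    (2, 2001, 0.8, 0),
    (2, 2002, 0.6, 2),
    (2, 2004, 0.4, 0),
]


def add_panel_covariates(rows):
    """Return one dict per person-year with Mundlak mean, lag and initial value."""
    by_person = defaultdict(list)
    for pid, year, gp, hosp in rows:
        by_person[pid].append((year, gp, hosp))

    out = []
    for pid, obs in by_person.items():
        obs.sort()  # chronological order within each person
        # Mundlak variable: person-level mean of the time-varying covariate.
        gp_mean = sum(gp for _, gp, _ in obs) / len(obs)
        hosp_by_year = {year: hosp for year, _, hosp in obs}
        # Initial value of the outcome (first observed year for this person).
        hosp_initial = obs[0][2]
        for year, gp, hosp in obs:
            out.append({
                "person": pid,
                "year": year,
                "gp": gp,
                "hosp": hosp,
                "gp_mean": gp_mean,
                # Lag by calendar year, not row position: a gap gives None.
                "hosp_lag1": hosp_by_year.get(year - 1),
                "hosp_initial": hosp_initial,
            })
    return out


panel = add_panel_covariates(records)
```

The resulting rows could then be passed to a count-model routine of the analyst's choice; looking the lag up by year rather than shifting rows is the design choice that keeps a re-entering individual from inheriting a value from before the gap.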
Conclusion/Implications
The availability of linked data, together with increases in computing power, has vastly increased their potential for use. It has also increased the complexity of the analyses being undertaken, making it necessary to recognise and address problems, such as endogeneity, that arise from the observational nature of these studies.
|
url |
https://ijpds.org/article/view/641 |