Analysing complex linked administrative data in health services research: Issues and solutions

Introduction Linked administrative data are increasingly being used to evaluate the impact of health policy on health-service use/cost because they can comprehensively capture whole of population interactions with the health system. These analyses are complex comprising unbalanced panels and are at...

Bibliographic Details
Main Authors: Rachael Rachael, David David, Mark Harris
Format: Article
Language: English
Published: Swansea University 2018-08-01
Series: International Journal of Population Data Science
Online Access: https://ijpds.org/article/view/641
ISSN: 2399-4908
Volume/Issue: 3(4)
DOI: 10.23889/ijpds.v3i4.641
Author affiliations: Curtin University
Introduction
Linked administrative data are increasingly being used to evaluate the impact of health policy on health-service use and cost because they can comprehensively capture whole-of-population interactions with the health system. These analyses are complex, comprising unbalanced panels, and are at risk of endogeneity and associated problems.

Objectives and Approach
We evaluated the impact of changes in the regularity of general practitioner (GP) contact on diabetes-related hospitalisation before and after care coordination policies, using whole-of-population, person-level linked primary care, hospital, Electoral Roll and death records. Complex panel random-effects modelling techniques were required because of:
- the unbalanced structure of the data (individuals could exit and re-enter the study repeatedly);
- over-dispersion and a high proportion of zeros;
- changes in the availability of tests (ascertainment bias);
- the likelihood of prior health service use influencing the dependent variable (initial conditions and simultaneity/reverse causality bias); and
- the likely correlation of observed and unobserved variables.

Results
Multivariable zero-inflated negative binomial and Cragg hurdle regressions with cluster-robust standard errors, which include separate components to model zero and non-zero outcomes, were required for these data. Mundlak variables (group means of time-varying variables) were used to relax the assumption in the random-effects estimator that the observed variables were uncorrelated with the unobserved ones. Prior health service use was adjusted for using 4-year lags of GP contact and a one-year lag of hospitalisation. Including the initial value of the dependent variable addressed the “initial conditions” problem. Ascertainment bias was addressed by including the number of years available for identification for each person as a covariate. AIC/BIC values were used to identify the best model. We found that more regular GP contact was associated with fewer hospitalisations, although this association attenuated over time.
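
The covariate construction described in the results (Mundlak group means, lagged prior use and the initial value of the outcome) can be illustrated with a minimal sketch. This is not the authors' code: the function name `add_panel_terms` and the field names `person_id`, `year`, `gp_regularity` and `hospitalisations` are hypothetical, and the toy rows are invented purely to show the mechanics. Lags are matched on calendar year rather than on row position, so a person who leaves and re-enters the panel gets a missing lag instead of a value borrowed from the wrong year. The regression models themselves are not shown.

```python
from collections import defaultdict
from statistics import mean


def add_panel_terms(rows, covariate, outcome, max_lag=4):
    """Add Mundlak, lag and initial-value terms to person-year rows.

    `rows` is a list of dicts, one per person-year, each holding the
    hypothetical keys 'person_id' and 'year' plus the named covariate
    and outcome. Returns new dicts; the input rows are not modified.
    """
    # Index every person's rows by calendar year so gaps are detectable.
    by_person = defaultdict(dict)
    for row in rows:
        by_person[row["person_id"]][row["year"]] = row

    prepared = []
    for years in by_person.values():
        first_year = min(years)
        # Mundlak term: the person-level mean of a time-varying covariate.
        covariate_mean = mean(r[covariate] for r in years.values())
        for year in sorted(years):
            new_row = dict(years[year])
            new_row[f"{covariate}_mean"] = covariate_mean
            # Lags of the covariate, left as None when that year is absent.
            for k in range(1, max_lag + 1):
                prior = years.get(year - k)
                new_row[f"{covariate}_lag{k}"] = (
                    prior[covariate] if prior is not None else None
                )
            # One-year lag of the outcome.
            prior = years.get(year - 1)
            new_row[f"{outcome}_lag1"] = (
                prior[outcome] if prior is not None else None
            )
            # Initial value: the outcome in the person's first observed year.
            new_row[f"{outcome}_initial"] = years[first_year][outcome]
            prepared.append(new_row)
    return prepared


# Invented toy panel: person 1 is unobserved in 2003 (an unbalanced panel).
panel = [
    {"person_id": 1, "year": 2001, "gp_regularity": 0.2, "hospitalisations": 1},
    {"person_id": 1, "year": 2002, "gp_regularity": 0.4, "hospitalisations": 0},
    {"person_id": 1, "year": 2004, "gp_regularity": 0.6, "hospitalisations": 2},
    {"person_id": 2, "year": 2001, "gp_regularity": 0.5, "hospitalisations": 0},
    {"person_id": 2, "year": 2002, "gp_regularity": 0.7, "hospitalisations": 0},
]
prepared = add_panel_terms(panel, "gp_regularity", "hospitalisations")
```

In this toy panel, person 1's 2004 row has no one-year lag because 2003 is missing, which is the behaviour an unbalanced panel calls for; whether rows with missing lags are dropped or handled some other way is a modelling decision the abstract does not specify.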

Conclusion/Implications
The availability of linked data, together with increases in computing power, has vastly increased the potential for their use. It has also increased the complexity of the analyses being undertaken, making it necessary to recognise and address problems, such as endogeneity, that arise from the observational nature of these studies.