Summary: | This presentation will outline methods developed by the ABS to define populations within linked administrative data. The methods developed use a combination of both direct and indirect signals (referred to as ‘signs of life’) to infer whether individuals are in a particular population at a given point in time. Data from multiple government sources are used.
Introduction
Linked administrative datasets hold significant potential to unlock new insights and understanding about different populations of interest. A key challenge is to create a research dataset that is properly representative of the population of interest at a given point in time. Without a representative population to serve as the basis of analysis, research outcomes are harder to interpret and to compare against those from other populations. This has flow-on consequences for any research findings derived from linked administrative data.
Objectives and Approach
This project sought to develop methods to define a representative Australian population from the Multi-Agency Data Integration Project (MADIP) data asset. The core of the asset is created through a three-way linkage between Australian Medicare, Social Security and Taxation datasets. Together, these datasets have very high coverage of the Australian population and enable high-quality linkage of other datasets into the asset. A ‘signs of life’ approach was used to identify, from the MADIP asset, a representative population at a given point in time.
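As a rough illustration of the general idea only, a ‘signs of life’ rule might be sketched as below. This is not the ABS method: the record layout, the source names, the 365-day window and the `min_sources` threshold are all assumptions made for the example, and the function name `in_population` is hypothetical.

```python
from datetime import date, timedelta


def in_population(events, reference_date, window_days=365, min_sources=1):
    """Return the ids of people with recent activity in enough distinct sources.

    `events` is an iterable of (person_id, source, activity_date) tuples.
    The window length and source threshold are illustrative assumptions.
    """
    window_start = reference_date - timedelta(days=window_days)
    sources_by_person = {}
    for person_id, source, activity_date in events:
        # Only activity inside the look-back window counts as a sign of life.
        if window_start <= activity_date <= reference_date:
            sources_by_person.setdefault(person_id, set()).add(source)
    return {
        person_id
        for person_id, sources in sources_by_person.items()
        if len(sources) >= min_sources
    }


# Made-up records for illustration; not real data.
events = [
    ("p1", "medicare", date(2016, 3, 1)),
    ("p1", "tax", date(2016, 6, 30)),
    ("p2", "social_security", date(2012, 1, 15)),
    ("p3", "medicare", date(2016, 8, 1)),
]
resident_ids = in_population(events, date(2016, 8, 9))
```

Here `p2` is excluded because its only activity falls well outside the window. Raising `min_sources` makes the rule stricter, trading coverage for confidence that a person is actually present.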
Results
An experimental representative population has been developed from the linkage spine that closely approximates the national age distribution and other breakdowns of the ABS’s Estimated Resident Population.
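One simple way to check how closely a derived population tracks a benchmark such as the Estimated Resident Population is to compute a ratio of counts per age group. The sketch below assumes this kind of check; the function name `coverage_ratios`, the age groupings and all counts are made up for illustration and are not results from this work.

```python
def coverage_ratios(derived_counts, benchmark_counts):
    """Ratio of derived to benchmark counts for each benchmark group.

    A ratio near 1.0 suggests close agreement; above 1.0 suggests the
    derived population includes people the benchmark does not.
    """
    return {
        group: derived_counts.get(group, 0) / benchmark
        for group, benchmark in benchmark_counts.items()
        if benchmark > 0  # skip empty benchmark groups to avoid division by zero
    }


# Illustrative counts only; not real ABS figures.
derived = {"0-14": 95, "15-64": 330, "65+": 80}
benchmark = {"0-14": 100, "15-64": 300, "65+": 80}
ratios = coverage_ratios(derived, benchmark)
```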
Conclusion / Implications
This work demonstrates one approach for deriving useful analytical populations from linked datasets whose scope is broader than the population of interest.
|