Introduction
With the explosion in data being collected and made available for research, linkage units receive an increasing amount of data. At the same time, researchers expect access to more current data. This growing influx of data can create resource constraints both for linkage units, which need to mobilise more staff time for data processing, and for data custodians, who are required to provide data updates more frequently.
Objectives and Approach
SA NT DataLink has designed the Secure Automated File Exchange (SAFE) in collaboration with the University of South Australia. SAFE provides a framework for the safe transfer of encrypted data from custodians into SA NT DataLink’s systems. A given custodian uses one private key to send personally identifying data via Secure File Transfer Protocol (SFTP). This data passes through the university’s IT infrastructure, where it is checked for encryption, and flows directly into a Demilitarised Zone (DMZ) within the highly protected environment of SA NT DataLink’s Data Linkage Unit (DLU). The same custodian then uses a separate private key to provide the corresponding encrypted, anonymised content data, again via SFTP. Given the less sensitive nature of this data type, it is deposited on secure university on-site storage, from which Data Integration Unit (DIU) staff manually transfer it to SA NT DataLink’s Custodian Controlled Data Repository (CCDR).
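To illustrate the custodian-side workflow described above, the following is a minimal sketch of one way to encrypt a data extract and deposit it over SFTP with a channel-specific private key. The abstract does not specify the tooling used by SAFE; the sketch assumes GnuPG for encryption and the Python library Paramiko for the SFTP transfer, and all host names, user names, file names and key paths are hypothetical placeholders. The content-data channel would use a separate private key and destination in the same manner.

```python
"""Hypothetical custodian-side sketch of a SAFE-style upload:
encrypt an extract with the linkage unit's public key, then
transfer it over SFTP using the key pair assigned to the
identifying-data channel. Names and paths are placeholders."""
import subprocess
import paramiko

LOCAL_FILE = "identifying_data.csv"                        # hypothetical extract
ENCRYPTED_FILE = LOCAL_FILE + ".gpg"
RECIPIENT_KEY = "linkage-unit@example.org"                 # DLU public key (placeholder)
SFTP_HOST = "sftp.example.edu"                             # university gateway (placeholder)
SFTP_USER = "custodian_a"
PRIVATE_KEY_PATH = "/secure/keys/custodian_a_identifying"  # channel-specific private key

# 1. Encrypt the extract so that only the linkage unit can decrypt it
#    once it arrives in the DMZ.
subprocess.run(
    ["gpg", "--batch", "--yes", "--encrypt",
     "--recipient", RECIPIENT_KEY,
     "--output", ENCRYPTED_FILE, LOCAL_FILE],
    check=True,
)

# 2. Upload the encrypted file over SFTP, authenticating with the
#    private key dedicated to this data channel.
key = paramiko.RSAKey.from_private_key_file(PRIVATE_KEY_PATH)
transport = paramiko.Transport((SFTP_HOST, 22))
transport.connect(username=SFTP_USER, pkey=key)
try:
    sftp = paramiko.SFTPClient.from_transport(transport)
    sftp.put(ENCRYPTED_FILE, f"/incoming/{ENCRYPTED_FILE}")
    sftp.close()
finally:
    transport.close()
```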
Results
SA NT DataLink is treating the implementation of SAFE with one data provider as a trial project. After successful testing, a rollout to other data custodians is possible. In parallel, alternative technical solutions for automated data transfers are being evaluated.
Conclusion / Implications
Automated data transfer solutions will reduce the effort required for data custodians to send data and for linkage units to receive and process data updates. Moreover, by reducing manual intervention, they will limit vulnerability to data privacy breaches and the risk of introducing errors into the data. However, data workflow automation depends on data provider requirements and on the availability of resources to process the received data.