End User Licence to Open Government Data? A Simulated Penetration Attack on Two Social Survey Datasets

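In the UK, the transparency agenda is forcing data stewardship organisations to review their dissemination policies and to consider whether to release data that is currently only available to a restricted community of researchers under licence as open data. Here we describe the results of a study providing evidence about the risks of such an approach via a simulated attack on two social survey datasets. This is also the first systematic attempt to simulate a jigsaw identification attack (one using a mashup of multiple data sources) on an anonymised dataset. The information that we draw on is collected from multiple online data sources and purchasable commercial data. The results indicate that such an attack against anonymised end user licence (EUL) datasets, if converted into open datasets, is possible and therefore we would recommend that penetration tests should be factored into any decision to make datasets (that are about people) open.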

Bibliographic Details
Main Authors: Elliot, Mark; Mackey, Elaine; O’Shea, Susan; Tudor, Caroline; Spicer, Keith
Format: Article
Language: English
Published: Sciendo 2016-06-01
Series: Journal of Official Statistics
Online Access: https://doi.org/10.1515/jos-2016-0019
id doaj-abc352e3e65e4d7882424f04318b6cad
record_format Article
spelling doaj-abc352e3e65e4d7882424f04318b6cad
timestamp: 2021-09-06T19:40:51Z
language: eng
publisher: Sciendo
journal: Journal of Official Statistics, ISSN 2001-7367
issue: vol. 32, no. 2, pp. 329-348 (2016-06-01)
DOI: 10.1515/jos-2016-0019
title: End User Licence to Open Government Data? A Simulated Penetration Attack on Two Social Survey Datasets
affiliations:
Elliot Mark, Mackey Elaine, O’Shea Susan: School of Social Sciences, University of Manchester, Manchester M13 9PL, UK
Tudor Caroline, Spicer Keith: Office for National Statistics, Segensworth Road, Titchfield, Fareham, Hampshire, PO15 5RR, UK
collection DOAJ
language English
format Article
sources DOAJ
author Elliot Mark
Mackey Elaine
O’Shea Susan
Tudor Caroline
Spicer Keith
title End User Licence to Open Government Data? A Simulated Penetration Attack on Two Social Survey Datasets
publisher Sciendo
series Journal of Official Statistics
issn 2001-7367
publishDate 2016-06-01
description In the UK, the transparency agenda is forcing data stewardship organisations to review their dissemination policies and to consider whether to release data that is currently only available to a restricted community of researchers under licence as open data. Here we describe the results of a study providing evidence about the risks of such an approach via a simulated attack on two social survey datasets. This is also the first systematic attempt to simulate a jigsaw identification attack (one using a mashup of multiple data sources) on an anonymised dataset. The information that we draw on is collected from multiple online data sources and purchasable commercial data. The results indicate that such an attack against anonymised end user licence (EUL) datasets, if converted into open datasets, is possible and therefore we would recommend that penetration tests should be factored into any decision to make datasets (that are about people) open.
url https://doi.org/10.1515/jos-2016-0019