PPDP-PCAO: An Efficient High-Dimensional Data Releasing Method With Differential Privacy Protection

Privacy protection in data publishing is an extremely important issue that has been the focus of extensive research in recent years. However, the existing methods have a host of limitations, especially for high-dimensional data publishing. Aiming at the problem of poor availability of publishing res...


Bibliographic Details
Main Authors: Wanjie Li, Xing Zhang, Xiaohui Li, Guanghui Cao, Qingyun Zhang
Format: Article
Language: English
Published: IEEE 2019-01-01
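Series: IEEE Access
Subjects: Differential privacy; evaluation mechanism; high-dimensional data; mutual information; principal component analysis optimization
Online Access: https://ieeexplore.ieee.org/document/8924645/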
id doaj-3d550a1acf9a4011a9d837414c32b16d
record_format Article
spelling doaj-3d550a1acf9a4011a9d837414c32b16d
2021-03-30T00:28:27Z
eng
IEEE
IEEE Access
2169-3536
2019-01-01
7
176429
176437
10.1109/ACCESS.2019.2957858
8924645
PPDP-PCAO: An Efficient High-Dimensional Data Releasing Method With Differential Privacy Protection
Wanjie Li 0 https://orcid.org/0000-0001-7084-4098
Xing Zhang 1 https://orcid.org/0000-0003-4697-1681
Xiaohui Li 2 https://orcid.org/0000-0002-1100-6550
Guanghui Cao 3 https://orcid.org/0000-0002-8711-7687
Qingyun Zhang 4 https://orcid.org/0000-0002-7815-8575
School of Electronics and Information Engineering, Liaoning University of Technology, Jinzhou, China
School of Electronics and Information Engineering, Liaoning University of Technology, Jinzhou, China
School of Electronics and Information Engineering, Liaoning University of Technology, Jinzhou, China
School of Electronics and Information Engineering, Liaoning University of Technology, Jinzhou, China
School of Electronics and Information Engineering, Liaoning University of Technology, Jinzhou, China
https://ieeexplore.ieee.org/document/8924645/
Differential privacy
evaluation mechanism
high-dimensional data
mutual information
principal component analysis optimization
collection DOAJ
language English
format Article
sources DOAJ
author Wanjie Li
Xing Zhang
Xiaohui Li
Guanghui Cao
Qingyun Zhang
title PPDP-PCAO: An Efficient High-Dimensional Data Releasing Method With Differential Privacy Protection
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2019-01-01
description Privacy protection in data publishing is an important issue that has been the focus of extensive research in recent years. However, existing methods have many limitations, especially for high-dimensional data publishing. To address the poor utility of published results caused by the “curse of dimensionality” in high-dimensional data publishing, we present the PPDP-PCAO (Privacy Preserving Data Publishing with Principal Component Analysis Optimization) method, which mitigates the loss of utility caused by the large amount of noise that high dimensionality introduces. PPDP-PCAO improves the Principal Component Analysis (PCA) algorithm by incorporating attribute importance and uses the improved PCA to reduce the dimension of the data, which lowers the time and space cost. PPDP-PCAO introduces a mutual-information-based evaluation mechanism into data release: it evaluates the data generated with different numbers of principal components to determine the optimal number. PPDP-PCAO also accounts for the presence of multiple sensitive attributes in high-dimensional data, for which traditional privacy budget allocation methods cannot provide personalized privacy protection. To this end, PPDP-PCAO introduces a sensitivity preference, combines it with optimal matching theory, and designs a hierarchical protection strategy for sensitive attributes. Extensive experiments on different real datasets demonstrate that PPDP-PCAO not only guarantees the privacy of the published dataset but also achieves significantly better accuracy and data utility than other high-dimensional data publishing methods.
topic Differential privacy
evaluation mechanism
high-dimensional data
mutual information
principal component analysis optimization
url https://ieeexplore.ieee.org/document/8924645/
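
The description above contrasts traditional privacy budget allocation with protection that depends on how sensitive each attribute is. The following is a minimal sketch of that general idea only, not the PPDP-PCAO strategy itself, whose details are not given in this record. It assumes a hypothetical rule in which each attribute's share of the total budget is inversely proportional to a user-supplied sensitivity preference weight, and it releases one count per attribute with the standard Laplace mechanism. The function names, the weights, and the example counts are all illustrative assumptions.

```python
import random


def allocate_budget(total_epsilon, sensitivity_preference):
    """Split total_epsilon across attributes.

    Assumed rule (illustrative, not taken from the paper): an attribute with a
    larger preference weight is treated as more sensitive and receives a
    smaller share of the budget, so its released value carries more noise.
    """
    if total_epsilon <= 0:
        raise ValueError("total_epsilon must be positive")
    if any(weight <= 0 for weight in sensitivity_preference.values()):
        raise ValueError("preference weights must be positive")
    inverse = {name: 1.0 / w for name, w in sensitivity_preference.items()}
    norm = sum(inverse.values())
    return {name: total_epsilon * inv / norm for name, inv in inverse.items()}


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release_counts(true_counts, budgets, rng, l1_sensitivity=1.0):
    """Add Laplace noise with scale l1_sensitivity / epsilon_i to each count.

    By sequential composition, releasing every count consumes the sum of the
    per-attribute budgets, which equals the total budget passed to
    allocate_budget.
    """
    return {
        name: true_counts[name] + laplace_noise(l1_sensitivity / budgets[name], rng)
        for name in true_counts
    }


# Hypothetical attributes and weights, for illustration only.
preference = {"age": 1.0, "salary": 2.0, "diagnosis": 4.0}
budgets = allocate_budget(1.0, preference)

# Hypothetical true counts of records matching one value per attribute.
counts = {"age": 120, "salary": 45, "diagnosis": 8}
noisy = release_counts(counts, budgets, random.Random(0))

for name in counts:
    print(name, round(budgets[name], 4), round(noisy[name], 2))
```

Under this assumed rule the per-attribute budgets always sum to the total, so the overall guarantee is unchanged and only the distribution of noise across attributes shifts toward the attributes marked as more sensitive.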