Efficient L-Diversity Algorithm for Preserving Privacy of Dynamically Published Datasets

Bibliographic Details
Main Authors: Odsuren Temuujin, Jinhyun Ahn, Dong-Hyuk Im
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/8805309/
Description
Summary: Although most conventional methods of preserving data privacy focus on static datasets, which remain unchanged after processing, real-world datasets are often modified dynamically. Therefore, privacy-preservation methods must maintain data privacy after dataset modification. Re-anonymization of entire datasets is inefficient when large datasets are frequently modified. Although several previous studies have addressed data privacy for incremental data updates (i.e., record insertions), they have not adequately addressed it for dynamic changes made to existing datasets (i.e., record updates and deletions). Therefore, we identified limitations of data-privacy preservation for dynamically evolving datasets and used anatomy instead of generalization and suppression to develop a more efficient l-diversity algorithm for preserving privacy of such datasets. We also used a Cuckoo filter, a new probabilistic data structure for approximate set-membership tests, to improve data-processing efficiency. Experimental results demonstrated that our proposed data-anonymization algorithm processed data more efficiently than other conventional algorithms, requiring much less running time than conventional re-anonymization of entire datasets. The Cuckoo-filtered algorithm was especially efficient, dramatically reducing operation execution times while maintaining privacy of dynamically evolving datasets.
ISSN: 2169-3536
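
The summary refers to anatomy and l-diversity without showing what they look like on data. The following is a minimal illustrative sketch in Python, not the authors' algorithm: it assumes the simplest "distinct" form of l-diversity (every group holds at least l different sensitive values) and takes the group assignment as given. The function names (anatomize, is_l_diverse, delete_value), the field names, and the sample records are all hypothetical. The Cuckoo filter mentioned in the summary is not modeled here.

```python
from collections import Counter, defaultdict


def anatomize(records, group_ids, qi_keys, sensitive_key):
    """Split records into a quasi-identifier table and a sensitive table.

    The two tables are linked only by a group id, so exact quasi-identifier
    values are published without generalization or suppression.
    """
    qi_table = []
    sensitive_table = defaultdict(Counter)  # group id -> sensitive value counts
    for record, gid in zip(records, group_ids):
        row = {key: record[key] for key in qi_keys}
        row["group"] = gid
        qi_table.append(row)
        sensitive_table[gid][record[sensitive_key]] += 1
    return qi_table, sensitive_table


def is_l_diverse(sensitive_table, l):
    """Distinct l-diversity: each group has at least l distinct sensitive values."""
    return all(len(counts) >= l for counts in sensitive_table.values())


def delete_value(sensitive_table, gid, value):
    """Remove one occurrence of a sensitive value from a group.

    Simplification: a full implementation would also drop the matching
    row from the quasi-identifier table.
    """
    counts = sensitive_table[gid]
    counts[value] -= 1
    if counts[value] == 0:
        del counts[value]


# Hypothetical sample data; the group assignment is assumed, not computed.
records = [
    {"age": 23, "zip": "11000", "disease": "flu"},
    {"age": 27, "zip": "11000", "disease": "asthma"},
    {"age": 35, "zip": "12000", "disease": "flu"},
    {"age": 36, "zip": "12000", "disease": "diabetes"},
]
group_ids = [1, 1, 2, 2]

qi_table, sensitive_table = anatomize(records, group_ids, ["age", "zip"], "disease")
before = is_l_diverse(sensitive_table, 2)

# A single deletion can leave a group with only one sensitive value.
delete_value(sensitive_table, 2, "diabetes")
after = is_l_diverse(sensitive_table, 2)
```

In this sketch the published tables satisfy 2-diversity at first, and one deletion in group 2 breaks it. That is the kind of situation the summary describes for record updates and deletions, where only the affected group would need repair rather than the whole dataset.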