Recursive Genetic Micro-Aggregation Technique: Information Loss, Disclosure Risk and Scoring Index
This research investigates the micro-aggregation problem in secure statistical databases by integrating the divide and conquer concept with a genetic algorithm. This is achieved by recursively dividing a micro-data set into two subsets based on the proximity distance similarity. On each subset the g...
Main Authors: | Ebaa Fayyoumi, Omar Alhuniti |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-05-01 |
Series: | Data |
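Subjects: | micro-aggregation techniques; genetic algorithm; secure statistical databases; information loss; disclosure risk |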
Online Access: | https://www.mdpi.com/2306-5729/6/5/53 |
id |
doaj-a5a34d1728b54d60b6a98373fea5dc5b |
---|---|
record_format |
Article |
spelling |
doaj-a5a34d1728b54d60b6a98373fea5dc5b; 2021-06-01T00:36:18Z; eng; MDPI AG; Data; 2306-5729; 2021-05-01; 6; 5; 53; 10.3390/data6050053; Recursive Genetic Micro-Aggregation Technique: Information Loss, Disclosure Risk and Scoring Index; Ebaa Fayyoumi 0; Omar Alhuniti 1; Department of Computer Science, The Hashemite University, Zarqa 13115, Jordan; Department of Antiquities, Amman 11118, Jordan; This research investigates the micro-aggregation problem in secure statistical databases by integrating the divide and conquer concept with a genetic algorithm. This is achieved by recursively dividing a micro-data set into two subsets based on the proximity distance similarity. On each subset the genetic operation “crossover” is performed until the convergence condition is satisfied. The recursion will be terminated if the size of the generated subset is satisfied. Eventually, the genetic operation “mutation” will be performed over all generated subsets that satisfied the variable group size constraint in order to maximize the objective function. Experimentally, the proposed micro-aggregation technique was applied to recommended real-life data sets. Results demonstrated a remarkable reduction in the computational time, which sometimes exceeded 70% compared to the state-of-the-art. Furthermore, a good equilibrium value of the Scoring Index $(SI)$ was achieved by involving a linear combination of the General Information Loss $(G_{IL})$ and the General Disclosure Risk $(G_{DR})$.; https://www.mdpi.com/2306-5729/6/5/53; micro-aggregation techniques; genetic algorithm; secure statistical databases; information loss; disclosure risk |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Ebaa Fayyoumi Omar Alhuniti |
spellingShingle |
Ebaa Fayyoumi Omar Alhuniti Recursive Genetic Micro-Aggregation Technique: Information Loss, Disclosure Risk and Scoring Index Data micro-aggregation techniques genetic algorithm secure statistical databases information loss disclosure risk |
author_facet |
Ebaa Fayyoumi Omar Alhuniti |
author_sort |
Ebaa Fayyoumi |
title |
Recursive Genetic Micro-Aggregation Technique: Information Loss, Disclosure Risk and Scoring Index |
title_sort |
recursive genetic micro-aggregation technique: information loss, disclosure risk and scoring index |
publisher |
MDPI AG |
series |
Data |
issn |
2306-5729 |
publishDate |
2021-05-01 |
description |
This research investigates the micro-aggregation problem in secure statistical databases by integrating the divide and conquer concept with a genetic algorithm. This is achieved by recursively dividing a micro-data set into two subsets based on the proximity distance similarity. On each subset the genetic operation “crossover” is performed until the convergence condition is satisfied. The recursion will be terminated if the size of the generated subset is satisfied. Eventually, the genetic operation “mutation” will be performed over all generated subsets that satisfied the variable group size constraint in order to maximize the objective function. Experimentally, the proposed micro-aggregation technique was applied to recommended real-life data sets. Results demonstrated a remarkable reduction in the computational time, which sometimes exceeded 70% compared to the state-of-the-art. Furthermore, a good equilibrium value of the Scoring Index $(SI)$ was achieved by involving a linear combination of the General Information Loss $(G_{IL})$ and the General Disclosure Risk $(G_{DR})$. |
topic |
micro-aggregation techniques; genetic algorithm; secure statistical databases; information loss; disclosure risk |
url |
https://www.mdpi.com/2306-5729/6/5/53 |
work_keys_str_mv |
AT ebaafayyoumi recursivegeneticmicroaggregationtechniqueinformationlossdisclosureriskandscoringindex AT omaralhuniti recursivegeneticmicroaggregationtechniqueinformationlossdisclosureriskandscoringindex |
_version_ |
1721414405750521856 |
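
Illustrative note on the abstract: the description above says the Scoring Index is a linear combination of an information-loss term and a disclosure-risk term. The sketch below shows one way such a score could be computed. It is not the article's implementation: the function names are hypothetical, the equal weighting (`weight=0.5`) is an assumption, and information loss is computed here as the within-group sum of squares divided by the total sum of squares (SSE/SST), a measure commonly used in the micro-aggregation literature. This record does not state how the article defines $G_{IL}$ or $G_{DR}$, so the disclosure risk is simply passed in as a number.

```python
from typing import List, Sequence

Record = Sequence[float]


def centroid(group: List[Record]) -> List[float]:
    """Attribute-wise mean of a non-empty group of records."""
    n = len(group)
    return [sum(column) / n for column in zip(*group)]


def squared_distance(a: Record, b: Record) -> float:
    """Squared Euclidean distance between two records."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def information_loss(groups: List[List[Record]]) -> float:
    """SSE / SST: the share of total variability lost when every record
    is replaced by the centroid of its group (assumed measure)."""
    all_records = [record for group in groups for record in group]
    overall = centroid(all_records)
    sst = sum(squared_distance(r, overall) for r in all_records)
    sse = 0.0
    for group in groups:
        center = centroid(group)
        sse += sum(squared_distance(r, center) for r in group)
    return sse / sst if sst else 0.0


def scoring_index(info_loss: float, disclosure_risk: float,
                  weight: float = 0.5) -> float:
    """Linear combination of the two terms; the weight is an assumption."""
    return weight * info_loss + (1.0 - weight) * disclosure_risk


# Two hypothetical groups of two-attribute records.
groups = [[(1.0, 1.0), (3.0, 3.0)], [(10.0, 10.0), (12.0, 12.0)]]
il = information_loss(groups)
si = scoring_index(il, disclosure_risk=0.4)
```

Lower values are better for both terms under this convention, so a grouping that keeps similar records together lowers the information-loss term, while the disclosure-risk term has to be estimated separately.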