Summary: | Master's === 國立中正大學 === 通訊工程研究所 === 100 === With rising privacy awareness and stricter legal norms, personally identifiable information (PII) must be protected against unauthorized or unnecessary release. Traditional re-identification risk tools examine released content and test whether it carries privacy risk, for example by checking a criterion such as k-anonymity. Truly successful privacy management requires carefully de-identifying the PII and then performing a reliable risk assessment on the released information. However, these two tasks (de-identification and risk assessment) form a cycle that balances the tradeoff between information loss and re-identification risk, and iterating through it is time-consuming and labor-intensive. To make risk assessment efficient and to verify a de-identified dataset against a k-anonymity threshold, this paper proposes a novel approach that constructs a cost-effective “penetration database”, which represents the original database but contains fewer tuples. The experimental results show that our approach can test whether de-identified data satisfies k-anonymity and that it speeds up the re-identification risk assessment within the overall de-identification cycle.
|