Applications of Non-Uniquely Decodable Codes to Privacy-Preserving High-Entropy Data Representation

Non-uniquely-decodable (non-UD) codes are codes that cannot be uniquely decoded without additional disambiguation information. They are mainly the non-prefix-free codes, in which a codeword can be a prefix of others, and thus the codeword boundary in...


Bibliographic Details
Main Authors: Muhammed Oğuzhan Külekci, Yasin Öztürk
Format: Article
Language: English
Published: MDPI AG 2019-04-01
Series: Algorithms
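Subjects: non-UD; non-prefix-free codes; selective encryption; massive data security; data coding; data compression; privacy preserving text algorithms; big data delivery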
Online Access: https://www.mdpi.com/1999-4893/12/4/78
id doaj-6e6d50fb010e40b6b03ab57e87d0d840
record_format Article
spelling doaj-6e6d50fb010e40b6b03ab57e87d0d840
2020-11-24T22:19:42Z
eng
MDPI AG
Algorithms
1999-4893
2019-04-01
12(4), 78
10.3390/a12040078
a12040078
Applications of Non-Uniquely Decodable Codes to Privacy-Preserving High-Entropy Data Representation
Muhammed Oğuzhan Külekci (Informatics Institute, Istanbul Technical University, 34469 Istanbul, Turkey)
Yasin Öztürk (Informatics Institute, Istanbul Technical University, 34469 Istanbul, Turkey)
collection DOAJ
language English
format Article
sources DOAJ
author Muhammed Oğuzhan Külekci
Yasin Öztürk
title Applications of Non-Uniquely Decodable Codes to Privacy-Preserving High-Entropy Data Representation
publisher MDPI AG
series Algorithms
issn 1999-4893
publishDate 2019-04-01
description Non-uniquely-decodable (non-UD) codes are codes that cannot be uniquely decoded without additional disambiguation information. They are mainly the non-prefix-free codes, in which a codeword can be a prefix of others, so codeword boundary information is essential for correct decoding. Because of this inherent decodability problem, non-UD codes have received little attention apart from a few studies that proposed using compressed data structures to represent the disambiguation information efficiently. Those studies showed that the compression ratio can get quite close to that of Huffman/arithmetic codes, with the additional capability of direct access into the compressed data, a feature missing from regular Huffman codes. In this study we investigate non-UD codes from another angle: the privacy of high-entropy data. We focus in particular on massive volumes, typical examples being encoded video or similar multimedia files. Representing such a volume with non-UD coding creates two elements, the disambiguation information and the payload, and decoding the original data becomes hard when either one is missing. We make use of this observation for privacy, and study the space consumption as well as the hardness of that decoding. We conclude that non-uniquely-decodable codes can be an alternative to selective encryption schemes that aim to secure only part of the data when the data is huge. We provide a freely available software implementation of the proposed scheme as well.
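The split into payload and disambiguation information described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the codeword assignment (shortest bit strings to the most frequent symbols) and the storage of boundaries as a plain list of codeword lengths are assumptions made for illustration only, and a practical scheme would hold the boundary information in a compressed data structure.

```python
from collections import Counter
from itertools import count, product


def build_codebook(data: bytes) -> dict:
    """Assumed assignment: frequent symbols get the shortest bit strings
    0, 1, 00, 01, 10, 11, 000, ... (deliberately not prefix-free)."""
    symbols = [s for s, _ in Counter(data).most_common()]
    words = ("".join(bits) for n in count(1) for bits in product("01", repeat=n))
    return dict(zip(symbols, words))


def encode(data: bytes, codebook: dict):
    """Return (payload, lengths): the concatenated codewords and the
    boundary information needed to split them again."""
    words = [codebook[s] for s in data]
    return "".join(words), [len(w) for w in words]


def decode(payload: str, lengths: list, codebook: dict) -> bytes:
    """Decoding is straightforward when both elements are available."""
    inverse = {w: s for s, w in codebook.items()}
    out, pos = [], 0
    for n in lengths:
        out.append(inverse[payload[pos:pos + n]])
        pos += n
    return bytes(out)


def count_parsings(payload: str, codebook: dict) -> int:
    """Count how many ways the payload splits into codewords when the
    boundary information is missing (simple dynamic programming)."""
    words = set(codebook.values())
    ways = [1] + [0] * len(payload)
    for end in range(1, len(payload) + 1):
        for w in words:
            start = end - len(w)
            if start >= 0 and payload[start:end] == w:
                ways[end] += ways[start]
    return ways[len(payload)]
```

With `payload, lengths = encode(data, build_codebook(data))`, calling `decode` with both elements restores `data`, while `count_parsings(payload, codebook)` gives a rough sense of how ambiguous the payload is on its own, even to someone who knows the codebook.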
topic non-UD
non-prefix-free codes
selective encryption
massive data security
data coding
data compression
privacy preserving text algorithms
big data delivery
url https://www.mdpi.com/1999-4893/12/4/78