Time-Universal Data Compression

Nowadays, a variety of data compressors (or archivers) are available, each with its own merits, and no single one can be called the best. One therefore faces the problem of choosing the best method to compress a given file, and the larger the file, the more important this choice becomes. It seems natural to try all the compressors and then choose the one that yields the shortest compressed file...

Full description

Bibliographic Details
Main Author: Boris Ryabko
Format: Article
Language: English
Published: MDPI AG 2019-05-01
Series: Algorithms
Subjects:
Online Access: https://www.mdpi.com/1999-4893/12/6/116
id doaj-206d867f303e4025a507ebd0dc227b47
record_format Article
spelling doaj-206d867f303e4025a507ebd0dc227b47 2020-11-25T01:16:17Z eng MDPI AG Algorithms 1999-4893 2019-05-01 12(6):116 10.3390/a12060116 Time-Universal Data Compression Boris Ryabko (Institute of Computational Technologies of the Siberian Branch of the Russian Academy of Science, 630090 Novosibirsk, Russia)
collection DOAJ
language English
format Article
sources DOAJ
author Boris Ryabko
spellingShingle Boris Ryabko
Time-Universal Data Compression
Algorithms
data compression
universal coding
time-series forecasting
author_facet Boris Ryabko
author_sort Boris Ryabko
title Time-Universal Data Compression
title_short Time-Universal Data Compression
title_full Time-Universal Data Compression
title_fullStr Time-Universal Data Compression
title_full_unstemmed Time-Universal Data Compression
title_sort time-universal data compression
publisher MDPI AG
series Algorithms
issn 1999-4893
publishDate 2019-05-01
description Nowadays, a variety of data compressors (or archivers) are available, each with its own merits, and no single one can be called the best. One therefore faces the problem of choosing the best method to compress a given file, and the larger the file, the more important this choice becomes. It seems natural to try all the compressors and then choose the one that yields the shortest compressed file, then transfer (or store) the index of the best compressor (this requires log m bits, if m is the number of compressors available) together with the compressed file. The only problem is the time, which increases substantially because the file must be compressed m times (in order to find the best compressor). We suggest a method of data compression whose performance is close to optimal, but for which the extra time needed is relatively small: the ratio of this extra time to the total computation time can be bounded, asymptotically, by an arbitrary positive constant. In short, the main idea of the suggested approach is as follows: in order to find the best compressor, try all of them, but use only a small part of the file for the comparison. Then apply the best compressor to the whole file. Note that there are many situations where it may be necessary to find the best data compressor out of a given set; in such cases, this is often done by comparing compressors empirically. One of the goals of this work is to turn such a selection process into a part of the data compression method, automating and optimizing it.
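The scheme described in the abstract (run every candidate compressor on a small prefix of the file, pick the winner, then compress the whole file once with it, prepending the winner's index) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the three standard-library compressors, the 1% sample fraction, and the whole-byte index header are all assumptions made for the sketch.

```python
import bz2
import lzma
import zlib

def time_universal_compress(data: bytes, sample_frac: float = 0.01) -> bytes:
    """Pick a compressor by trying all candidates on a small prefix,
    then compress the whole file with the winner."""
    compressors = [zlib.compress, bz2.compress, lzma.compress]  # m = 3
    m = len(compressors)
    # Compress only a small sample of the file with each candidate.
    sample = data[: max(1, int(len(data) * sample_frac))]
    best = min(range(m), key=lambda i: len(compressors[i](sample)))
    # Store the winner's index (ceil(log2 m) bits in principle; a whole
    # byte here for simplicity), then the file compressed by that method.
    return bytes([best]) + compressors[best](data)

def time_universal_decompress(blob: bytes) -> bytes:
    """Read the index byte, then decompress with the matching method."""
    decompressors = [zlib.decompress, bz2.decompress, lzma.decompress]
    return decompressors[blob[0]](blob[1:])
```

The extra work is m compressions of the sample rather than m compressions of the whole file, which is what keeps the time overhead small relative to the single full-file compression.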
topic data compression
universal coding
time-series forecasting
url https://www.mdpi.com/1999-4893/12/6/116
work_keys_str_mv AT borisryabko timeuniversaldatacompression
_version_ 1725150406886031360