Time-Universal Data Compression
Nowadays, a variety of data compressors (or archivers) are available, each of which has its merits, and it is impossible to single out the best one. Thus, one faces the problem of choosing the best method to compress a given file, and the larger the file, the more important this problem becomes. It seems natural to try all the compressors and choose the one that yields the shortest compressed file, then transfer (or store) the index number of the best compressor (this requires log m bits, if m is the number of compressors available) together with the compressed file. The only problem is the time, which increases substantially due to the need to compress the file m times (in order to find the best compressor). We suggest a method of data compression whose performance is close to optimal, but for which the extra time needed is relatively small: the ratio of this extra time to the total computation time can be bounded, asymptotically, by an arbitrary positive constant. In short, the main idea of the suggested approach is as follows: in order to find the best data compressor, try all of them, but use only a small part of the file for these trial compressions. Then apply the best data compressor to the whole file. Note that there are many situations where it may be necessary to find the best data compressor out of a given set, and this is often done by comparing compressors empirically. One of the goals of this work is to turn such a selection process into a part of the data compression method, automating and optimizing it.
Main Author: | Boris Ryabko |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2019-05-01 |
Series: | Algorithms |
Subjects: | data compression; universal coding; time-series forecasting |
Online Access: | https://www.mdpi.com/1999-4893/12/6/116 |
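The abstract describes a simple scheme: trial-compress a small prefix of the file with each of the m available compressors, pick the one that does best on the prefix, then compress the whole file with that winner and store its index alongside the compressed data. The following is a minimal sketch of that idea, not the paper's implementation: it uses Python's standard-library codecs zlib, bz2, and lzma as the compressor set, and the function names, the prefix fraction, and the use of a whole index byte (rather than the log m bits mentioned in the abstract) are illustrative choices.

```python
import bz2
import lzma
import zlib

# Candidate compressors: (name, compress, decompress) triples.
# The scheme works with any finite set of m compressors.
COMPRESSORS = [
    ("zlib", zlib.compress, zlib.decompress),
    ("bz2", bz2.compress, bz2.decompress),
    ("lzma", lzma.compress, lzma.decompress),
]

def time_universal_compress(data: bytes, prefix_fraction: float = 0.05) -> bytes:
    """Pick the compressor that performs best on a small prefix, then
    apply it to the whole file. Output = 1 index byte + compressed payload."""
    # Trial-compress only a small prefix, so the extra time stays small
    # relative to the single full-file compression that follows.
    prefix = data[: max(1, int(len(data) * prefix_fraction))]
    best_idx = min(
        range(len(COMPRESSORS)),
        key=lambda i: len(COMPRESSORS[i][1](prefix)),
    )
    return bytes([best_idx]) + COMPRESSORS[best_idx][1](data)

def time_universal_decompress(blob: bytes) -> bytes:
    """Read the index byte, then decompress with the matching codec."""
    idx, payload = blob[0], blob[1:]
    return COMPRESSORS[idx][2](payload)
```

The decompressor needs no trial phase at all: the stored index tells it which codec to use, so the only decoding overhead of the scheme is the index itself.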
id |
doaj-206d867f303e4025a507ebd0dc227b47 |
---|---|
record_format |
Article |
doi |
10.3390/a12060116 |
author_affiliation |
Institute of Computational Technologies of the Siberian Branch of the Russian Academy of Science, 630090 Novosibirsk, Russia |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Boris Ryabko |
title |
Time-Universal Data Compression |
publisher |
MDPI AG |
series |
Algorithms |
issn |
1999-4893 |
publishDate |
2019-05-01 |
description |
Nowadays, a variety of data compressors (or archivers) are available, each of which has its merits, and it is impossible to single out the best one. Thus, one faces the problem of choosing the best method to compress a given file, and the larger the file, the more important this problem becomes. It seems natural to try all the compressors and choose the one that yields the shortest compressed file, then transfer (or store) the index number of the best compressor (this requires log m bits, if m is the number of compressors available) together with the compressed file. The only problem is the time, which increases substantially due to the need to compress the file m times (in order to find the best compressor). We suggest a method of data compression whose performance is close to optimal, but for which the extra time needed is relatively small: the ratio of this extra time to the total computation time can be bounded, asymptotically, by an arbitrary positive constant. In short, the main idea of the suggested approach is as follows: in order to find the best data compressor, try all of them, but use only a small part of the file for these trial compressions. Then apply the best data compressor to the whole file. Note that there are many situations where it may be necessary to find the best data compressor out of a given set, and this is often done by comparing compressors empirically. One of the goals of this work is to turn such a selection process into a part of the data compression method, automating and optimizing it. |
topic |
data compression; universal coding; time-series forecasting |
url |
https://www.mdpi.com/1999-4893/12/6/116 |