Pattern compression.

Bibliographic Details
Main Author: Blais, Pascal.
Other Authors: Laganiere, Robert
Format: Others
Published: University of Ottawa (Canada) 2009
Online Access: http://hdl.handle.net/10393/8526
http://dx.doi.org/10.20381/ruor-7353
Description
Summary: Compression is an important process that makes the communication and storage of information more efficient. This document presents a new method for compressing document imagery. Document imagery refers to electronic pictures of printed documents created by fax machines, scanners and some software. The most important aspects of this new method, called Pattern Compression, are that it takes advantage of the kind of information document imagery is made of and that some data may be lost in the process as long as the loss remains visually imperceptible. The method is a variation of vector quantization. The input document is broken down into numerous small two-dimensional patterns, which typically correspond to the characters of a text. A codebook is created from these patterns. Patterns that are very similar are considered to be the same. A document is compressed by representing it as a sequence of codebook representatives. Statistics gathered from various experiments are presented in order to compare the efficiency of the new method with other known methods. They show that Pattern Compression generally compresses around twice as much as the CCITT Group 4 compression standard. Another advantage of the method is that, by altering the values of the algorithm parameters, a user can control the quality of the compression. In addition, the decompression process is very fast and the technique can also be applied to color images. Unfortunately, the method has some problems: the compression process is rather long, it has difficulty breaking down large patterns, and its compression ratio is not as good as that of another compression method called Cartesian Perceptual Compression. Finally, future work and ideas on how to improve Pattern Compression are suggested.
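
The codebook idea in the summary can be illustrated with a minimal sketch. This is not the thesis's algorithm. It assumes the patterns have already been extracted as small binary bitmaps, uses the fraction of differing pixels as a hypothetical similarity measure, and treats `threshold` as a stand-in for the quality parameters the summary mentions. The names `mismatch`, `compress` and `decompress` are illustrative only, and pattern positions on the page are omitted.

```python
from typing import List, Sequence, Tuple

# A pattern is a small binary bitmap: a tuple of rows of 0/1 pixels (assumed format).
Pattern = Tuple[Tuple[int, ...], ...]


def mismatch(a: Pattern, b: Pattern) -> float:
    """Fraction of differing pixels; 1.0 when the two shapes differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return 1.0
    total = sum(len(row) for row in a)
    if total == 0:
        return 0.0
    diff = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return diff / total


def compress(patterns: Sequence[Pattern],
             threshold: float = 0.1) -> Tuple[List[Pattern], List[int]]:
    """Build a codebook and encode each pattern as a codebook index.

    A pattern close enough to an existing representative reuses its index
    (this is the lossy step); otherwise it becomes a new representative.
    """
    codebook: List[Pattern] = []
    indices: List[int] = []
    for p in patterns:
        for i, rep in enumerate(codebook):
            if mismatch(p, rep) <= threshold:
                indices.append(i)
                break
        else:
            codebook.append(p)
            indices.append(len(codebook) - 1)
    return codebook, indices


def decompress(codebook: Sequence[Pattern], indices: Sequence[int]) -> List[Pattern]:
    """Decoding is a plain table lookup of representatives."""
    return [codebook[i] for i in indices]


# Two glyph-like bitmaps and a noisy copy of the first (one pixel of nine flipped).
ring = ((1, 1, 1), (1, 0, 1), (1, 1, 1))
ring_noisy = ((1, 1, 1), (1, 0, 1), (1, 1, 0))
bar = ((0, 1, 0), (0, 1, 0), (0, 1, 0))

codebook, indices = compress([ring, bar, ring_noisy, bar], threshold=0.15)
restored = decompress(codebook, indices)
```

In this sketch a larger `threshold` merges more patterns into fewer representatives, giving a smaller codebook at the cost of fidelity. Decoding involves only index lookups, which is consistent with the summary's remark that decompression is fast.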