IEICE global.ieice.org Site

Keyword Search Result

[Keyword] compression via substring enumeration (4 hits)

A Universal Two-Dimensional Source Coding by Means of Subblock Enumeration (Open Access)
Takahiro OTA, Hiroyoshi MORITA, Akiko MANADA

PAPER-Information Theory

Vol: E102-A No: 2
Page(s): 440-449
The technique of lossless compression via substring enumeration (CSE) is a kind of enumerative code and uses a probabilistic model built from the circular string of an input source for encoding a one-dimensional (1D) source. CSE is applicable to two-dimensional (2D) sources, such as images, by treating a line of pixels of a 2D source as a symbol of an extended alphabet. In the initial step of the CSE encoding process, we need to output the number of occurrences of all symbols of the extended alphabet, so the time complexity increases exponentially as the size of the source grows. To reduce computational time, we can rearrange the pixels of a 2D source into a 1D source string along a space-filling curve such as a Hilbert curve. However, information on adjacent cells in the 2D source may be lost in the conversion. To reduce the time complexity and compress a 2D source without converting it to a 1D source, we propose a new CSE which can encode a 2D source in a block-by-block fashion instead of a line-by-line fashion. The proposed algorithm uses the flat torus of an input 2D source as a probabilistic model instead of the circular string of the source. Moreover, we prove the asymptotic optimality of the proposed algorithm for 2D general sources.
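
As a rough illustration of the flat-torus model mentioned in this abstract, the following Python sketch counts every h-by-w subblock of a small 2D array, reading with wraparound in both directions. The function name `torus_block_counts` and its parameters are hypothetical; the sketch shows only the counting step, not the encoder proposed in the paper.

```python
from collections import Counter


def torus_block_counts(grid, h, w):
    """Count every h-by-w block of `grid`, wrapping around both axes.

    `grid` is a list of equal-length rows. Each cell is used once as the
    top-left corner of a block, so the counts sum to rows * cols.
    """
    rows, cols = len(grid), len(grid[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            # Read the block starting at (r, c), wrapping with modulo.
            block = tuple(
                tuple(grid[(r + dr) % rows][(c + dc) % cols] for dc in range(w))
                for dr in range(h)
            )
            counts[block] += 1
    return counts


# Hypothetical 2x2 binary source used only for illustration.
example = [[0, 1],
           [1, 0]]
counts_2x2 = torus_block_counts(example, 2, 2)
counts_1x2 = torus_block_counts(example, 1, 2)
```

For a grid with a single row and `h = 1`, the same function counts the substrings of a circular string, which is the 1D model the abstract describes.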

A Compact Tree Representation of an Antidictionary
Takahiro OTA, Hiroyoshi MORITA

PAPER-Information Theory

Vol: E100-A No: 9
Page(s): 1973-1984
In both the theoretical analysis and the practical use of an antidictionary coding algorithm, an important problem is how to encode the antidictionary of an input source. This paper proposes a compact tree representation of an antidictionary built from the circular string of an input source. We use the tree-encoding technique of compression via substring enumeration to encode the tree representation of the antidictionary. Moreover, we propose a new two-pass universal antidictionary coding algorithm by means of the proposed tree representation. We prove that the proposed algorithm is asymptotically optimal for a stationary ergodic source.
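
To make the notion of an antidictionary concrete, here is a minimal brute-force Python sketch. It assumes the usual definition of an antidictionary as the set of minimal forbidden words: words that do not occur in the source although all of their proper substrings do. The function names and the `max_len` cutoff (assumed to be at most the string length) are hypothetical, and the sketch does not reproduce the compact tree representation proposed in the paper.

```python
def circular_substrings(s, k):
    """Return the set of length-k substrings of s read circularly (k <= len(s))."""
    doubled = s + s
    return {doubled[i:i + k] for i in range(len(s))}


def antidictionary(s, max_len, alphabet="01"):
    """Brute-force minimal forbidden words of the circular string s, up to max_len."""
    present = {0: {""}}
    for k in range(1, max_len + 1):
        present[k] = circular_substrings(s, k)

    words = []
    for k in range(1, max_len + 1):
        # Extending an occurring prefix guarantees that w[:-1] occurs.
        for prefix in present[k - 1]:
            for a in alphabet:
                w = prefix + a
                # w is minimal forbidden if it is absent but its suffix w[1:] occurs.
                if w not in present[k] and w[1:] in present[k - 1]:
                    words.append(w)
    return sorted(words, key=lambda w: (len(w), w))


# Hypothetical input used only for illustration.
mfw = antidictionary("011", 3)
```

For the circular string `011`, the words `00`, `010` and `111` never occur, while each of their proper substrings does.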

Lossless Data Compression via Substring Enumeration for k-th Order Markov Sources with a Finite Alphabet
Ken-ichi IWATA, Mitsuharu ARIMURA

PAPER-Source Coding and Data Compression

Vol: E99-A No: 12
Page(s): 2130-2135
A generalization of compression via substring enumeration (CSE) for k-th order Markov sources with a finite alphabet is proposed, and an upper bound on the codeword length of the proposed method is presented. We analyze the worst-case maximum redundancy of CSE for k-th order Markov sources with a finite alphabet. The compression ratio of the proposed method asymptotically converges to the optimal one for k-th order Markov sources with a finite alphabet as the length n of a source string tends to infinity.

Evaluation of Maximum Redundancy of Data Compression via Substring Enumeration for k-th Order Markov Sources
Ken-ichi IWATA, Mitsuharu ARIMURA, Yuki SHIMA

PAPER-Information Theory

Vol: E97-A No: 8
Page(s): 1754-1760
Dubé and Beaudoin proposed a lossless data compression technique called compression via substring enumeration (CSE) in 2010. We evaluate an upper bound on the number of bits used by the CSE technique to encode any binary string from an unknown member of a known class of k-th order Markov processes. We compare the worst-case maximum redundancy obtained by the CSE technique for any binary string with the least possible value of the worst-case maximum redundancy obtained by the best fixed-to-variable length code that satisfies the Kraft inequality.
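
The Kraft inequality mentioned in this abstract states that the codeword lengths l_1, ..., l_m of a binary code must satisfy 2^(-l_1) + ... + 2^(-l_m) <= 1 for the code to be uniquely decodable. The short Python sketch below checks that condition with exact arithmetic; the function names are hypothetical and the example lengths are chosen only for illustration.

```python
from fractions import Fraction


def kraft_sum(lengths):
    """Return the exact Kraft sum of 2^(-l) over binary codeword lengths."""
    return sum(Fraction(1, 2 ** length) for length in lengths)


def satisfies_kraft(lengths):
    """True if the lengths satisfy the Kraft inequality for a binary code."""
    return kraft_sum(lengths) <= 1


# Lengths of the prefix code {0, 10, 110, 111}: the sum is exactly 1.
ok = satisfies_kraft([1, 2, 3, 3])
# No uniquely decodable binary code has lengths 1, 1, 2: the sum is 5/4.
bad = satisfies_kraft([1, 1, 2])
```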