Keyword Search Result

[Keyword] universal source coding (10 hits)

1-10 of 10 hits
  • An Efficient Bayes Coding Algorithm for Changing Context Tree Model

    Koshi SHIMADA  Shota SAITO  Toshiyasu MATSUSHIMA  

     
    PAPER-Source Coding and Data Compression

  Publicized:
    2023/08/24
      Vol:
    E107-A No:3
      Page(s):
    448-457

    In the context tree model, the occurrence probability of each symbol is determined by a finite past sequence; the model defines a broad class of sources that includes i.i.d. and Markov sources. This paper proposes a non-stationary source whose context tree model changes from interval to interval. The Bayes code for this source requires weighting the posterior probabilities of the context tree models and of the change points, so its computational complexity is usually of exponential order. The challenge is therefore how to reduce this computational complexity. In this paper, we propose a special class of prior probability distributions over context tree models and change points, and we develop an efficient Bayes coding algorithm by combining two existing Bayes coding algorithms. The algorithm minimizes the Bayes risk function for the proposed source, and its computational complexity is of polynomial order. We investigate the behavior and performance of the proposed algorithm through experiments.
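    To make the "weighting of posterior probabilities" concrete, the sketch below computes the ideal code length of a Bayes (mixture) code by brute-force summation over a finite model class; the function and argument names are illustrative, and the point of the algorithm in the paper is to obtain this quantity without the exponential-order summation.

        import math

        def bayes_mixture_code_length(x, models, prior):
            # models: hypothetical list of functions, each returning the probability
            # P_m(x) that a candidate model assigns to the whole string x.
            # prior: matching prior weights summing to one.
            mixture = sum(w * p(x) for w, p in zip(prior, models))
            # Ideal (non-integer) code length of the Bayes mixture code, in bits.
            return -math.log2(mixture)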

  • A Universal Two-Dimensional Source Coding by Means of Subblock Enumeration Open Access

    Takahiro OTA  Hiroyoshi MORITA  Akiko MANADA  

     
    PAPER-Information Theory

      Vol:
    E102-A No:2
      Page(s):
    440-449

    Compression via substring enumeration (CSE) is a kind of enumerative lossless code that uses a probabilistic model built from the circular string of an input source to encode a one-dimensional (1D) source. CSE can be applied to two-dimensional (2D) sources, such as images, by treating a line of pixels of the 2D source as a symbol of an extended alphabet. At the initial step of the CSE encoding process, the number of occurrences of every symbol of the extended alphabet must be output, so the time complexity grows exponentially as the source becomes large. To reduce the computational time, the pixels of a 2D source can be rearranged into a 1D string along a space-filling curve such as a Hilbert curve; however, information about adjacent cells in the 2D source may be lost in this conversion. To reduce the time complexity and compress a 2D source without converting it to a 1D source, we propose a new CSE that encodes a 2D source block by block instead of line by line. The proposed algorithm uses the flat torus of the input 2D source as a probabilistic model instead of the circular string of the source. Moreover, we prove the asymptotic optimality of the proposed algorithm for 2D general sources.
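    For reference, the space-filling-curve conversion mentioned above can be sketched as follows, using the standard Hilbert index-to-coordinate mapping; the function names are our own, and this is the preprocessing step that the proposed block-by-block CSE avoids.

        def hilbert_d2xy(n, d):
            # Map Hilbert-curve index d to (x, y) on an n x n grid (n a power of two).
            x = y = 0
            t = d
            s = 1
            while s < n:
                rx = 1 & (t // 2)
                ry = 1 & (t ^ rx)
                if ry == 0:                      # rotate the current sub-square
                    if rx == 1:
                        x, y = s - 1 - x, s - 1 - y
                    x, y = y, x
                x += s * rx
                y += s * ry
                t //= 4
                s *= 2
            return x, y

        def hilbert_linearize(block):
            # Flatten an n x n block into a 1D sequence along the Hilbert curve.
            n = len(block)
            return [block[y][x] for x, y in (hilbert_d2xy(n, d) for d in range(n * n))]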

  • A Compact Tree Representation of an Antidictionary

    Takahiro OTA  Hiroyoshi MORITA  

     
    PAPER-Information Theory

      Vol:
    E100-A No:9
      Page(s):
    1973-1984

    In both theoretical analysis and practical use of an antidictionary coding algorithm, an important problem is how to encode the antidictionary of an input source. This paper proposes a compact tree representation of an antidictionary built from the circular string of an input source. We use a tree-encoding technique from compression via substring enumeration to encode the tree representation of the antidictionary. Moreover, we propose a new two-pass universal antidictionary coding algorithm based on the proposed tree representation. We prove that the proposed algorithm is asymptotically optimal for stationary ergodic sources.
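    For readers unfamiliar with antidictionaries, the sketch below builds one by brute force as the set of minimal forbidden words of a (non-circular) binary string; it is only meant to fix the definition and reflects neither the circular-string construction nor the compact tree representation proposed in the paper.

        def minimal_forbidden_words(x, alphabet=("0", "1")):
            # All factors (substrings) of x, including the empty string.
            factors = {x[i:j] for i in range(len(x) + 1) for j in range(i, len(x) + 1)}
            mfws = set()
            # w = a + u + b (length >= 2) is minimal forbidden iff w never occurs
            # in x while both a + u and u + b do occur in x.
            for u in factors:
                for a in alphabet:
                    for b in alphabet:
                        w = a + u + b
                        if w not in factors and (a + u) in factors and (u + b) in factors:
                            mfws.add(w)
            return mfws

    For example, minimal_forbidden_words("0101") returns {"00", "11", "1010"}: none of these occurs in "0101", yet each becomes a factor once its first or last symbol is removed.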

  • Lossless Data Compression via Substring Enumeration for k-th Order Markov Sources with a Finite Alphabet

    Ken-ichi IWATA  Mitsuharu ARIMURA  

     
    PAPER-Source Coding and Data Compression

      Vol:
    E99-A No:12
      Page(s):
    2130-2135

    A generalization of compression via substring enumeration (CSE) to k-th order Markov sources with a finite alphabet is proposed, and an upper bound on the codeword length of the proposed method is presented. We analyze the worst-case maximum redundancy of CSE for k-th order Markov sources with a finite alphabet. The compression ratio of the proposed method asymptotically converges to the optimal one for k-th order Markov sources with a finite alphabet as the length n of the source string tends to infinity.

  • Evaluation of the Bayes Code from Viewpoints of the Distribution of Its Codeword Lengths

    Shota SAITO  Nozomi MIYA  Toshiyasu MATSUSHIMA  

     
    PAPER-Source Coding

      Vol:
    E98-A No:12
      Page(s):
    2407-2414

    This paper considers the universal lossless variable-length source coding problem and investigates the Bayes code from the viewpoint of the distribution of its codeword lengths. First, we show that the codeword lengths of the Bayes code satisfy asymptotic normality; this can be seen as an investigation of the asymptotic shape of the distribution of codeword lengths. Second, we show that the codeword lengths of the Bayes code satisfy the law of the iterated logarithm; this can be seen as an investigation of the asymptotic endpoints of the distribution of codeword lengths. Moreover, the overflow probability, which concerns the tail of the distribution of codeword lengths, is studied for the Bayes code. We derive upper and lower bounds on the infimum of the thresholds for which the overflow probability does not exceed ε ∈ (0,1). We also analyze the necessary and sufficient condition on a threshold for the overflow probability of the Bayes code to approach zero asymptotically.
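    As a reading aid, one common formalization of the overflow probability and of the threshold in question is the following (our notation, stated as an assumption rather than as the paper's own definitions), where ℓ(X^n) denotes the codeword length of the Bayes code for a source sequence X^n:

        P_{\mathrm{ov}}(n, R) = \Pr\!\left[ \tfrac{1}{n}\,\ell(X^n) \ge R \right],
        \qquad
        R^*(n, \varepsilon) = \inf\{ R : P_{\mathrm{ov}}(n, R) \le \varepsilon \}.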

  • Evaluation of Maximum Redundancy of Data Compression via Substring Enumeration for k-th Order Markov Sources

    Ken-ichi IWATA  Mitsuharu ARIMURA  Yuki SHIMA  

     
    PAPER-Information Theory

      Vol:
    E97-A No:8
      Page(s):
    1754-1760

    Dubé and Beaudoin proposed a lossless data compression technique called compression via substring enumeration (CSE) in 2010. We evaluate an upper bound on the number of bits used by the CSE technique to encode any binary string emitted by an unknown member of a known class of k-th order Markov processes. We compare the worst-case maximum redundancy achieved by the CSE technique for any binary string with the least possible value of the worst-case maximum redundancy achieved by the best fixed-to-variable-length code satisfying the Kraft inequality.
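    For orientation, the quantities being compared admit a standard formalization (our notation, an assumption about the setup rather than a quote from the paper): for a fixed-to-variable-length code with length function ℓ on binary strings of length n and a class Θ of k-th order Markov sources,

        \sum_{x^n \in \{0,1\}^n} 2^{-\ell(x^n)} \le 1 \quad \text{(Kraft inequality)},
        \qquad
        R_n(\ell) = \sup_{\theta \in \Theta} \max_{x^n} \left( \ell(x^n) - \log_2 \frac{1}{P_\theta(x^n)} \right),

    and the benchmark is the least value of R_n(ℓ) over all length functions ℓ satisfying the Kraft inequality.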

  • Relationship among Complexities of Individual Sequences over Countable Alphabet

    Shigeaki KUZUOKA  Tomohiko UYEMATSU  

     
    PAPER-Information Theory

      Vol:
    E89-A No:7
      Page(s):
    2047-2055

    This paper investigates relations among four complexities of an individual sequence over a countably infinite alphabet, and shows that two kinds of empirical entropies and the self-entropy rate with respect to a Markov source are asymptotically equal and are lower bounded by the maximum number of phrases in a distinct parsing of the sequence. Some connections with source coding theorems are also investigated.
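    To make the phrase-counting quantity concrete, the sketch below performs the classical Lempel-Ziv incremental parsing, which is one distinct parsing of a sequence (not necessarily the one maximizing the number of phrases); the names are illustrative.

        def lz_incremental_parse(seq):
            # Parse seq into phrases, each a previously unseen phrase extended
            # by one symbol; symbols may come from any (countable) alphabet.
            seen = set()
            phrases = []
            phrase = ()
            for symbol in seq:
                phrase = phrase + (symbol,)
                if phrase not in seen:
                    seen.add(phrase)
                    phrases.append(phrase)
                    phrase = ()
            if phrase:                 # possibly repeated final phrase
                phrases.append(phrase)
            return phrases

    For example, "ababbababaa" parses into a|b|ab|ba|bab|aa, so the parsing has six phrases.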

  • Analysis of Zero-Redundancy Estimator with a Finite Window for Markovian Source

    Mohammad M. RASHID  Tsutomu KAWABATA  

     
    PAPER-Information Theory

      Vol:
    E88-A No:10
      Page(s):
    2819-2825

    Prediction of the actual symbol probability is crucial for statistical data compression that uses an arithmetic coder. The Krichevsky-Trofimov (KT) estimator has been a standard predictor and is applied in CTW and FWCTW methods. However, the KT estimator performs poorly when previously non-occurring symbols appear. To rectify this, we previously proposed a zero-redundancy estimator, in particular one with a finite window (Rashid and Kawabata, ISIT 2003), for non-stationary sources. In this paper, we analyze the zero-redundancy estimators in the case of a Markovian source and give an asymptotic evaluation of the redundancy. We show that, when the window size is large, one of the estimators has a per-symbol redundancy given by one half of the dimension of the positive parameters divided by the window size.
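    For context, the sketch below is the standard sequential KT probability assignment (the "add 1/2" rule) that the zero-redundancy estimator is meant to improve on; the zero-redundancy and finite-window variants of the paper are not shown, and the function name is our own.

        def kt_probability(sequence, alphabet_size=2):
            # Sequential Krichevsky-Trofimov estimator: after counts n_a have
            # been observed, the next occurrence of symbol a is assigned
            # probability (n_a + 1/2) / (n + alphabet_size / 2).
            counts = [0] * alphabet_size
            prob = 1.0
            for symbol in sequence:      # symbols are integers 0..alphabet_size-1
                n = sum(counts)
                prob *= (counts[symbol] + 0.5) / (n + alphabet_size / 2)
                counts[symbol] += 1
            return prob

    The ideal code length that an arithmetic coder would approach is -log2 of this probability.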

  • Conditional Lempel-Ziv Complexity and Its Application to Source Coding Theorem with Side Information

    Tomohiko UYEMATSU  Shigeaki KUZUOKA  

     
    LETTER-Source Coding/Image Processing

      Vol:
    E86-A No:10
      Page(s):
    2615-2617

    This paper proposes the conditional LZ complexity and analyzes its properties. In particular, we show an inequality corresponding to Ziv's inequality for a distinct parsing of a pair of sequences. Further, as a byproduct of this result, we give a simple proof of the asymptotic optimality of Ziv's universal source coding algorithm with side information.

  • A Context Tree Weighting Algorithm with an Incremental Context Set

    Tsutomu KAWABATA  Frans M. J. WILLEMS  

     
    PAPER-Source Coding and Data Compression

      Vol:
    E83-A No:10
      Page(s):
    1898-1903

    We propose a variation of the Context Tree Weighting algorithm for tree sources, modified such that the growth of the context set resembles Lempel-Ziv parsing. We analyze this algorithm, give a concise upper bound on the individual redundancy for any tree source, and prove the asymptotic optimality of the data compression rate for any stationary ergodic source.
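    For reference, the standard CTW weighted probability from the original context tree weighting literature is given by the recursion below, where P_e^s is the KT estimate for the symbols observed in context s and D is the maximum context depth; the incremental-context variant proposed in this paper changes how the context set grows, which this fixed-depth formula does not capture.

        P_w^s =
        \begin{cases}
          \tfrac{1}{2}\, P_e^s + \tfrac{1}{2}\, P_w^{0s}\, P_w^{1s}, & \text{if } |s| < D, \\
          P_e^s, & \text{if } |s| = D,
        \end{cases}

    computed at every node s of the binary context tree; the code length assigned to the source sequence is -\log_2 P_w^{\lambda} at the root \lambda, up to the overhead of arithmetic coding.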