
Keyword Search Result

[Keyword] universal coding (10 hits)

Hits 1-10 of 10
  • Equivalences among Some Information Measures for Individual Sequences and Their Applications for Fixed-Length Coding Problems

    Tomohiko UYEMATSU  Tetsunao MATSUTA  

     
    PAPER-Source Coding and Data Compression

      Publicized:
    2023/08/16
      Vol:
    E107-A No:3
      Page(s):
    393-403

    This paper proposes three new information measures for individual sequences and clarifies their properties. Our new information measures are called the non-overlapping max-entropy, the overlapping smooth max-entropy, and the non-overlapping smooth max-entropy, respectively. These measures are related to the fixed-length coding of individual sequences. We investigate these measures and show the following three properties: (1) The non-overlapping max-entropy coincides with the topological entropy. (2) The overlapping smooth max-entropy and the non-overlapping smooth max-entropy coincide with the Ziv-entropy. (3) When an individual sequence is drawn from an ergodic source, the overlapping smooth max-entropy and the non-overlapping smooth max-entropy coincide with the entropy rate of the source. Further, we apply these information measures to the fixed-length coding of individual sequences and propose some new universal coding schemes which are asymptotically optimal.
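    A minimal Python sketch of the kind of empirical block-counting quantities involved; the two functions below (an overlapping block-count entropy and a delta-smoothed non-overlapping variant) are illustrative surrogates chosen by the editor, not the paper's formal definitions of the max-entropies.

      # Illustrative surrogates for block-counting entropies of an individual
      # sequence x over a finite alphabet; k is the block length. These are NOT
      # the paper's definitions, only rough analogues: log(#distinct k-blocks)/k,
      # optionally after discarding a delta-fraction of the rarest blocks.
      from collections import Counter
      from math import log2

      def overlapping_block_entropy(x: str, k: int) -> float:
          blocks = {x[i:i + k] for i in range(len(x) - k + 1)}
          return log2(len(blocks)) / k        # topological-entropy-like quantity

      def nonoverlapping_smooth_max_entropy(x: str, k: int, delta: float) -> float:
          blocks = [x[i:i + k] for i in range(0, len(x) - k + 1, k)]
          counts = Counter(blocks)
          budget = delta * len(blocks)        # mass we are allowed to discard
          kept = len(counts)
          for _, c in sorted(counts.items(), key=lambda kv: kv[1]):
              if c > budget:
                  break
              budget -= c
              kept -= 1                       # drop one rare block type
          return log2(max(kept, 1)) / k

      x = "ababababbbababab" * 8
      print(overlapping_block_entropy(x, 4))
      print(nonoverlapping_smooth_max_entropy(x, 4, delta=0.05))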

  • A Sufficient Condition for the Existence of a Universal Slepian-Wolf Code

    Shigeaki KUZUOKA  

     
    PAPER-Information Theory

      Vol:
    E93-A No:7
      Page(s):
    1355-1362

    Universal Slepian-Wolf coding for parametric general sources is considered. Our main result shows that, under mild conditions on the family of sources, there exists a universal decoder that asymptotically attains the same random-coding error exponent as the maximum-likelihood decoder.
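    For orientation, the classical random-coding error exponent for Slepian-Wolf coding of X at rate R with side information Y at the decoder has the Gallager-type form below; the paper's point is that, under its conditions, a universal decoder attains the same exponent without knowing the source. The formula is quoted from the standard random-binning analysis, not from the paper itself.

      E_r(R) = \max_{0 \le \rho \le 1} \Bigl[ \rho R - \log \sum_{y} \Bigl( \sum_{x} P_{XY}(x,y)^{\frac{1}{1+\rho}} \Bigr)^{1+\rho} \Bigr]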

  • Fixed-Slope Universal Lossy Coding for Individual Sequences and Nonstationary Sources

    Shigeaki KUZUOKA  Tomohiko UYEMATSU  

     
    PAPER-Information Theory

      Vol:
    E91-A No:3
      Page(s):
    836-845

    This paper investigates the fixed-slope lossy coding of individual sequences and nonstationary sources. We clarify that, for a given individual sequence, the optimal cost attainable by blockwise lossy encoders is equal to the optimal average cost with respect to the empirical distribution of the given sequence. Moreover, we show that, for a given nonstationary source, the optimal cost attainable by blockwise encoders is equal to the supremum of the optimal average cost over all the stationary sources in the stationary hull of the given source. In addition, we show that a universal lossy coding algorithm based on the Lempel-Ziv 78 code attains the optimal cost for any individual sequence and any nonstationary source.
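    The "fixed-slope" formulation referred to here is, in its generic form, the Lagrangian trade-off below: rather than fixing the rate or the distortion, one fixes a slope s > 0 and minimizes a weighted sum of per-symbol codelength and per-symbol distortion. The display is a sketch of that generic formulation, not a quotation of the paper's notation.

      \min_{\hat{x}^n} \; \frac{1}{n}\,\ell(\hat{x}^n) \;+\; s \cdot \frac{1}{n}\,\rho_n(x^n, \hat{x}^n)

    where \ell(\hat{x}^n) is the length of the codeword assigned to the reproduction \hat{x}^n and \rho_n is the distortion measure.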

  • Universal Coding for Correlated Sources with Complementary Delivery

    Akisato KIMURA  Tomohiko UYEMATSU  Shigeaki KUZUOKA  

     
    PAPER

      Vol:
    E90-A No:9
      Page(s):
    1840-1847

    This paper deals with a universal coding problem for a certain kind of multiterminal source coding system that we call the complementary delivery coding system. In this system, messages from two correlated sources are jointly encoded, and each decoder has access to one of the two messages to enable it to reproduce the other message. Both fixed-to-fixed length and fixed-to-variable length lossless coding schemes are considered. Explicit constructions of universal codes and bounds of the error probabilities are clarified via type-theoretical and graph-theoretical analyses.
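    As background, for a memoryless correlated pair (X, Y) the fixed-length complementary delivery problem is solvable at any rate exceeding the larger of the two conditional entropies, as sketched below; this benchmark is quoted from the standard memoryless analysis for orientation, while the paper itself concerns universal code constructions and error-probability bounds.

      R > \max\{\, H(X \mid Y), \; H(Y \mid X) \,\}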

  • Intrinsic Randomness Problem in the Framework of Slepian-Wolf Separate Coding System

    Yasutada OOHAMA  

     
    PAPER-Information Theory

      Vol:
    E90-A No:7
      Page(s):
    1406-1417

    This paper deals with the random number generation problem under the framework of the separate coding system for correlated memoryless sources posed and investigated by Slepian and Wolf. Two correlated data sequences of length n are separately encoded into nR1- and nR2-bit messages at their respective locations, and these messages are sent to an information processing center, where the decoder wishes to generate an approximation of a sequence of independent, uniformly distributed random variables of length nR3 from the two received messages. The admissible rate region is defined as the set of all triples (R1, R2, R3) for which the approximation error goes to zero as n tends to infinity. In this paper we examine the asymptotic behavior of the approximation error inside and outside the admissible rate region. We derive an explicit lower bound on the optimal exponent for the approximation error to vanish and show that it can be attained by universal codes. Furthermore, we derive an explicit lower bound on the optimal exponent for the approximation error to tend to 2 as n goes to infinity outside the admissible rate region.
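    The statement that the approximation error "tends to 2" outside the admissible region is natural once the error is measured by the (unnormalized) variational distance, which never exceeds 2; the display below records this bound. Reading the approximation criterion as variational distance is the editor's assumption about the standard formulation, not a quotation from the paper.

      d(P, Q) \;=\; \sum_{u} \bigl| P(u) - Q(u) \bigr| \;\le\; 2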

  • Universal Lossy Coding for Individual Sequences Based on Complexity Functions

    Shigeaki KUZUOKA  Tomohiko UYEMATSU  

     
    PAPER-Information Theory

      Vol:
    E90-A No:2
      Page(s):
    491-503

    This paper investigates the fixed-rate and fixed-distortion lossy coding problems for individual sequences subject to a subadditive distortion measure. Fixed-rate and fixed-distortion universal lossy coding schemes based on the complexity of the sequence are proposed. The obtained coding theorems reveal that the optimal distortion (resp. rate) attainable by fixed-rate (resp. fixed-distortion) lossy coding is equal to the optimal average distortion (resp. rate) with respect to the overlapping empirical distribution of the given sequence. Some connections with the lossy coding problem for ergodic sources are also investigated.
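    A distortion measure ρ = {ρ_n} is called subadditive when the distortion of concatenated blocks is at most the sum of the blockwise distortions, as sketched below; this is the usual meaning of the term, stated here for orientation rather than quoted from the paper.

      \rho_{n+m}\bigl( (x^n, x'^m), (y^n, y'^m) \bigr) \;\le\; \rho_n(x^n, y^n) + \rho_m(x'^m, y'^m)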

  • A Generalization of B. S. Clarke and A. R. Barron's Asymptotics of Bayes Codes for FSMX Sources

    Masayuki GOTOH  Toshiyasu MATSUSHIMA  Shigeichi HIRASAWA  

     
    PAPER-Source Coding

      Vol:
    E81-A No:10
      Page(s):
    2123-2132

    We generalize B. S. Clarke and A. R. Barron's analysis of the Bayes method to FSMX sources. The FSMX source considered here is specified by the set of all its states and its parameter value. First, we show the asymptotic codelengths of individual sequences under the Bayes codes for FSMX sources. Second, we show the asymptotic expected codelengths. The Bayesian posterior density and the maximum-likelihood estimator satisfy asymptotic normality for finite ergodic Markov sources, and this is the key to our analysis.
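    The Clarke-Barron result being generalized states that, for a smooth k-parameter family with Fisher information I(θ) and prior (weight function) w, the expected redundancy of the Bayes mixture code m behaves as below; the formula is the original i.i.d. statement, reproduced from the standard literature for context rather than from this paper.

      E_\theta\!\bigl[ -\log m(X^n) \bigr] - n H(\theta) \;=\; \frac{k}{2} \log \frac{n}{2\pi e} \;+\; \log \frac{\sqrt{\det I(\theta)}}{w(\theta)} \;+\; o(1)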

  • Data Compression by Context Sorting

    Hidetoshi YOKOO  Masaharu TAKAHASHI  

     
    PAPER-Information Theory and Coding Theory

      Vol:
    E79-A No:5
      Page(s):
    681-686

    This paper proposes a new lossless data compression method that utilizes a context sorting algorithm. Every symbol in the data can be predicted by taking its immediately preceding characters, or context, into account. The context sorting algorithm sorts the set of all previous contexts to find the one most similar to the current context. It then predicts the next symbol by sorting previous symbol-context pairs in order of context similarity; the codeword for the next symbol represents the rank of that symbol in this sorted sequence. The compression performance is evaluated both analytically and empirically. Although the proposed method operates character by character, with no probability distribution used to make a prediction, its compression performance is comparable to that of the best known data compression utilities.
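    A minimal Python sketch of the rank-by-context-similarity idea described above: previously seen (context, symbol) pairs are ordered by the length of the common suffix with the current context, and the output for each position is the rank of the actual next symbol in that ordering. This is an editor's simplification for illustration, not the authors' algorithm or code.

      # Context-sorting sketch: encode each symbol as its rank among candidates
      # ordered by context similarity (longest common suffix with the current
      # context); small ranks mean good prediction. Illustrative only.

      def common_suffix_len(a: str, b: str) -> int:
          n = 0
          while n < len(a) and n < len(b) and a[-1 - n] == b[-1 - n]:
              n += 1
          return n

      def context_sort_ranks(data: str, order: int = 8):
          history = []                  # (context, next_symbol) pairs, oldest first
          ranks = []
          for i, sym in enumerate(data):
              ctx = data[max(0, i - order):i]
              # most similar contexts first; ties broken toward the most recent
              cand = sorted(reversed(history),
                            key=lambda cs: common_suffix_len(cs[0], ctx),
                            reverse=True)
              seen, ordered_syms = set(), []
              for _, s in cand:
                  if s not in seen:
                      seen.add(s)
                      ordered_syms.append(s)
              rank = ordered_syms.index(sym) if sym in seen else len(ordered_syms)
              ranks.append(rank)
              history.append((ctx, sym))
          return ranks

      print(context_sort_ranks("abracadabra abracadabra"))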

  • Highly Efficient Universal Coding with Classifying to Subdictionaries for Text Compression

    Yasuhiko NAKANO  Hironori YAHAGI  Yoshiyuki OKADA  Shigeru YOSHIDA  

     
    PAPER-Algorithms, Data Structures and Computational Complexity

      Vol:
    E77-A No:9
      Page(s):
    1520-1526

    We developed a simple, practical, adaptive data compression algorithm of the LZ78 class. Under the Lempel-Ziv greedy parsing, string boundaries are not related to the statistical history modeled by finite-state sources. We have previously reported an algorithm that classifies data into subdictionaries (CSD); it uses multiple subdictionaries and conditions the current string on the previous one to obtain a higher compression ratio. In this paper, we present a practical implementation of this method suitable for any kind of data, and show that CSD is more efficient than LZC, the method used by the compress program available on UNIX systems. The compression performance of CSD was about 10% better than that of LZC with a practical dictionary size (an 8k-entry dictionary) on test data from the Calgary Compression Corpus. With hashing, the processing speed of CSD became as fast as that of LZC, although the CSD algorithm is more complicated than LZC.
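    An editor's toy sketch of the "classify into subdictionaries" idea: an LZ78-style greedy parse in which the subdictionary used for the current phrase is selected by the last character of the previous phrase, so that each phrase is conditioned on its predecessor. It is meant only to illustrate the conditioning, not to reproduce the authors' CSD implementation.

      # Toy LZ78-style parser with subdictionaries: the subdictionary for the
      # current phrase is chosen from the previous phrase's last character
      # (a crude stand-in for CSD's classification). Output triples are
      # (class, phrase_index_within_subdictionary, next_char).

      def csd_like_parse(data: str):
          subdicts = {}                     # class -> {phrase: index}
          out = []
          prev_class = ""                   # class derived from the previous phrase
          i = 0
          while i < len(data):
              d = subdicts.setdefault(prev_class, {"": 0})
              phrase = ""
              # greedy: extend while the candidate phrase is in the subdictionary
              while i < len(data) and phrase + data[i] in d:
                  phrase += data[i]
                  i += 1
              nxt = data[i] if i < len(data) else ""
              out.append((prev_class, d[phrase], nxt))
              if nxt:
                  d[phrase + nxt] = len(d)  # register the newly parsed phrase
                  i += 1
              prev_class = (phrase + nxt)[-1:] or prev_class
          return out

      print(csd_like_parse("abababcabababc"))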

  • A Universal Coding Scheme Based on Minimizing Minimax Redundancy for Sources with an Unknown Model

    Joe SUZUKI  

     
    PAPER-Information Theory and Coding Theory

      Vol:
    E76-A No:7
      Page(s):
    1234-1239

    This paper's main objective is to clearly describe the construction of a universal code that minimizes Davisson's minimax redundancy when both the true model and the stochastic parameters are unknown. Minimax redundancy is defined as the maximum difference, over the source range, between the expected per-symbol code length and the per-symbol source entropy. A universal coding scheme is formulated here in terms of the weight function, i.e., a method is presented for determining a weight function that minimizes the minimax redundancy even when the true model is unknown. It is subsequently shown that the minimax redundancy achieved by the presented coding method is upper-bounded by the minimax redundancy of Rissanen's semi-predictive coding method.
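    In the usual Davisson-style formulation, the per-symbol minimax redundancy and the weight-function (mixture) codes referred to above take the form sketched below, where H_n(θ) is the block entropy of the source with parameter θ; this is written out for orientation and is not quoted from the paper.

      r_n^* \;=\; \min_{Q_n} \max_{\theta} \frac{1}{n} \Bigl( E_\theta\bigl[ -\log Q_n(X^n) \bigr] - H_n(\theta) \Bigr),
      \qquad
      Q_w(x^n) \;=\; \int w(\theta)\, P_\theta(x^n)\, d\theta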