  • Application of a Word-Based Text Compression Method to Japanese and Chinese Texts

    Shigeru YOSHIDA  Takashi MORIHARA  Hironori YAHAGI  Noriko ITANI  

     
    PAPER-Information Theory

    Vol: E85-A  No: 12  Page(s): 2933-2938

    Conventional text compression schemes that sample 8 bits at a time cannot compress 16-bit Asian language codes well. Previously, we reported the application of a word-based text compression method that uses 16-bit sampling to the compression of Japanese texts. This paper describes our further efforts in applying a word-based method with a static canonical Huffman encoder to both Japanese and Chinese texts. The method is designed to support a multilingual environment: the word dictionary and the canonical Huffman code table are replaced with those for the respective language. A computer simulation showed that the method is effective for both languages. The compression ratio obtained was slightly below 0.5 without Markov context and about 0.4 with first-order Markov context.
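
    The abstract does not give the details of the encoder, so the following is only a minimal sketch of how a static canonical Huffman code over word tokens can be built in general. The function names `code_lengths` and `canonical_codes` and the toy token list are hypothetical, and the paper's actual word dictionary, segmentation, and code tables are not reproduced here.

```python
import heapq
from collections import Counter


def code_lengths(freqs):
    """Return {symbol: Huffman code length} for a table of symbol counts."""
    if len(freqs) == 1:
        return {sym: 1 for sym in freqs}
    # Heap entries are (weight, unique tie-breaker, symbols in the subtree).
    heap = [(w, i, [sym]) for i, (sym, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for sym in s1 + s2:
            lengths[sym] += 1
        heapq.heappush(heap, (w1 + w2, next_id, s1 + s2))
        next_id += 1
    return lengths


def canonical_codes(lengths):
    """Assign canonical codes: order by (length, symbol), count upward."""
    codes = {}
    code = 0
    prev_len = 0
    for sym, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        code <<= length - prev_len  # append zeros when the length grows
        codes[sym] = format(code, "0{}b".format(length))
        code += 1
        prev_len = length
    return codes


# Toy example: the "words" are placeholder tokens, not real dictionary entries.
tokens = "a b a c a b a d".split()
lengths = code_lengths(Counter(tokens))
codes = canonical_codes(lengths)
encoded = "".join(codes[t] for t in tokens)
```

    Because canonical codes are fully determined by the code lengths, a static table for each language can be stored as a list of lengths alongside that language's word dictionary, which is one reason such tables are convenient to swap.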