
Author Search Result

[Author] Kun JIANG(3hit)

1-3hit
  • Highly Compressed Lists of Integers with Dense Padding Modes

    Kun JIANG  Xingshen SONG  Yuexiang YANG  

     
    LETTER-Data Engineering, Web Information Systems

      Publicized:
    2015/08/19
      Vol:
    E98-D No:11
      Page(s):
    1986-1989

    Index compression is partly responsible for the performance of today's Internet search engines. Among recent compression techniques, Simple9 packs as many integers as possible into a single 32-bit machine word using 9 different padding modes. However, the number of wasted bits in Simple9 remains large. Previous work has focused on reducing the unused trailing bits of the padding modes and has proposed additional modes that make full use of the available status-bit values. Instead, we focus on the bits wasted at the level of the integer list: when too few integers remain to fill a complete dense mode, we pad the list with extra zeros. More precisely, we first propose a novel index compression method called SimpleD, whose dense padding modes achieve more compact storage than Simple9. We then design a metric for extracting the inserted extra zero integers during the decoding phase. Experiments on the TREC WT2G and GOV2 datasets show that our encoder outperforms Simple9 while retaining a very fast decompression speed.
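
For readers unfamiliar with the baseline, here is a minimal sketch of plain Simple9, not of SimpleD. It assumes the commonly described layout (a 4-bit selector in the top bits and a 28-bit payload) and a greedy choice of the densest mode that fits. The function names `encode` and `decode` are illustrative and do not come from the paper.

```python
# Assumed Simple9 layout: top 4 bits = selector (status bits), low 28 bits = payload.
# Each mode is (number of integers, bits per integer); some modes leave
# trailing payload bits unused (e.g. 9 x 3 = 27 bits, 5 x 5 = 25 bits).
MODES = [(28, 1), (14, 2), (9, 3), (7, 4), (5, 5), (4, 7), (3, 9), (2, 14), (1, 28)]


def encode(values):
    """Greedily pack non-negative integers (< 2**28) into 32-bit words."""
    words = []
    pos = 0
    while pos < len(values):
        for selector, (count, width) in enumerate(MODES):
            chunk = values[pos:pos + count]
            # A mode fits only if enough integers remain and each fits in `width` bits.
            if len(chunk) == count and all(0 <= v < (1 << width) for v in chunk):
                break
        else:
            raise ValueError(f"value at index {pos} does not fit in 28 bits")
        word = selector << 28
        for i, v in enumerate(chunk):
            word |= v << (i * width)
        words.append(word)
        pos += count
    return words


def decode(words):
    """Inverse of encode: read the selector, then unpack the payload."""
    out = []
    for word in words:
        count, width = MODES[word >> 28]
        mask = (1 << width) - 1
        for i in range(count):
            out.append((word >> (i * width)) & mask)
    return out


values = [1] * 28 + [5, 300, 7]
words = encode(values)
assert decode(words) == values
```

In this sketch, a short tail such as `[1, 1, 1]` cannot use the 28 x 1-bit mode because too few integers remain, so it falls back to the 3 x 9-bit mode. This is the kind of list-level waste the abstract says SimpleD targets by padding extra zeros.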

  • Dominant Fairness Fairness: Hierarchical Scheduling for Multiple Resources in Heterogeneous Datacenters

    Wenzhu WANG  Kun JIANG  Yusong TAN  Qingbo WU  

     
    LETTER-Fundamentals of Information Systems

      Publicized:
    2016/03/03
      Vol:
    E99-D No:6
      Page(s):
    1678-1681

    Hierarchical scheduling for multiple resources is partly responsible for the performance achieved in large-scale datacenters. However, the latest scheduling technique, Hierarchy Dominant Resource Fairness (H-DRF)[1], has some shortcomings in heterogeneous environments, such as starving certain jobs or allocating resources unfairly. This is because a heterogeneous environment brings new challenges. In this paper, we propose a novel scheduling algorithm called Dominant Fairness Fairness (DFF). DFF tries to keep resource allocation fair, avoid job starvation, and improve system resource utilization. We implement DFF in the YARN system, one of the most commonly used schedulers for large-scale clusters. The experimental results show that our proposed algorithm leads to higher resource utilization and better throughput than H-DRF.
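
As background for the fairness notion that H-DRF builds on, the following is a minimal sketch of flat (non-hierarchical) Dominant Resource Fairness. It is neither H-DRF nor the proposed DFF. It assumes a single homogeneous resource pool, fixed per-task demands, and that every task needs a positive amount of at least one resource. The function name `drf_allocate` and the cluster sizes are hypothetical.

```python
def drf_allocate(capacity, demands):
    """Flat DRF sketch: repeatedly give one task to the user whose
    dominant share (largest fraction of any single resource) is lowest.

    capacity: {resource: total amount}
    demands:  {user: {resource: amount needed per task}}
    Returns   {user: number of tasks launched}
    """
    used = {r: 0 for r in capacity}
    tasks = {u: 0 for u in demands}
    active = set(demands)  # users that can still be offered resources

    def dominant_share(user):
        return max(tasks[user] * demands[user][r] / capacity[r] for r in capacity)

    while active:
        # sorted() makes tie-breaking deterministic.
        user = min(sorted(active), key=dominant_share)
        need = demands[user]
        if all(used[r] + need[r] <= capacity[r] for r in capacity):
            for r in capacity:
                used[r] += need[r]
            tasks[user] += 1
        else:
            # This user's next task no longer fits; stop offering to them.
            active.discard(user)
    return tasks


# Hypothetical cluster: 9 CPUs and 18 GB of memory shared by two users.
result = drf_allocate(
    {"cpu": 9, "mem": 18},
    {"A": {"cpu": 1, "mem": 4}, "B": {"cpu": 3, "mem": 1}},
)
```

With these inputs, user A ends up with three tasks and user B with two, so both reach a dominant share of 2/3 (memory for A, CPU for B).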

  • Well-Balanced Successive Simple-9 for Inverted Lists Compression

    Kun JIANG  Yuexiang YANG  Qinghua ZHENG  

     
    PAPER-Data Engineering, Web Information Systems

      Publicized:
    2017/04/17
      Vol:
    E100-D No:7
      Page(s):
    1416-1424

    The growth in the amount of information available on the Internet, together with thousands of user queries per second, poses huge challenges for index updates and query processing in search engines. Index compression is partly responsible for the performance of existing search engines. The choice of an index compression algorithm must weigh three factors: compression ratio, compression speed, and decompression speed. In this paper, we study the well-known Simple-9 compression, which involves many branch, table-lookup, and data-transfer operations when processing each 32-bit machine word. To enhance the compression and decompression performance of the Simple-9 algorithm, we propose a successive storage structure and processing metric that compresses two successive Simple-9 encoded sequences of integers in a single data processing procedure, hence the name Successive Simple-9 (SSimple-9). In essence, the algorithm reduces the branch, table-lookup, and data-transfer operations needed when compressing the integer sequence. More precisely, we first present the data storage format and mask table of the SSimple-9 algorithm. Then, for each mode in the mask table, we design and hard-code the main steps of the compression and decompression processes. Finally, analysis and comparison of the experimental results on simulated and TREC datasets show the compression and decompression speedup of the proposed SSimple-9 algorithm.
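
The abstract does not give the SSimple-9 storage format or mask table, so the sketch below only illustrates the general idea of handling two Simple-9 words per step with one table lookup. It assumes standard Simple-9 words (4-bit selector, 28-bit payload) and a hypothetical `PAIR_TABLE` keyed on both selectors. The names `pack`, `decode_pairs` and `PAIR_TABLE` are illustrative and are not the authors' design.

```python
# Standard Simple-9 modes: (number of integers, bits per integer).
MODES = [(28, 1), (14, 2), (9, 3), (7, 4), (5, 5), (4, 7), (3, 9), (2, 14), (1, 28)]

# Hypothetical pair table: for each of the 81 selector pairs, precompute the
# (word index, shift, mask) extraction steps for both words, so the decoder
# does one lookup per two words instead of one lookup per word.
PAIR_TABLE = {}
for sel_a, (count_a, width_a) in enumerate(MODES):
    for sel_b, (count_b, width_b) in enumerate(MODES):
        steps = [(0, i * width_a, (1 << width_a) - 1) for i in range(count_a)]
        steps += [(1, i * width_b, (1 << width_b) - 1) for i in range(count_b)]
        PAIR_TABLE[sel_a * 9 + sel_b] = steps


def pack(selector, values):
    """Build one Simple-9 word by hand for a given selector (test helper)."""
    count, width = MODES[selector]
    if len(values) != count or any(not 0 <= v < (1 << width) for v in values):
        raise ValueError("values do not match the chosen mode")
    word = selector << 28
    for i, v in enumerate(values):
        word |= v << (i * width)
    return word


def decode_pairs(words):
    """Decode Simple-9 words two at a time using the combined table."""
    out = []
    paired = len(words) - len(words) % 2
    for k in range(0, paired, 2):
        pair = (words[k], words[k + 1])
        key = (pair[0] >> 28) * 9 + (pair[1] >> 28)
        for idx, shift, mask in PAIR_TABLE[key]:
            out.append((pair[idx] >> shift) & mask)
    if len(words) % 2:
        # Odd trailing word: fall back to ordinary single-word decoding.
        word = words[-1]
        count, width = MODES[word >> 28]
        out.extend((word >> (i * width)) & ((1 << width) - 1) for i in range(count))
    return out


words = [pack(6, [5, 300, 7]), pack(8, [123456]), pack(3, [1, 2, 3, 4, 5, 6, 7])]
decoded = decode_pairs(words)
```

The paper reports hard-coding the steps for each mode. In Python the saving from a combined lookup is not observable, so this sketch shows the control flow only and makes no performance claim.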