
Author Search Result

[Author] Lei ZHOU (11 hits)

  • Distributed Synchronization for Message-Passing Based Embedded Multiprocessors

    Hao XIAO  Ning WU  Fen GE  Guanyu ZHU  Lei ZHOU  

     
    LETTER-Architecture

    Vol: E98-D No:2
    Page(s): 272-275

    This paper presents a synchronization mechanism that implements the lock and barrier protocols in a decentralized manner through explicit message passing. A simple and efficient synchronization control mechanism supports queued synchronization without contention. Using state-of-the-art Application-Specific Instruction-set Processor (ASIP) technology, we embed the synchronization functionality into a baseline processor, so that the mechanism incurs ultra-low overhead. Experimental results show that the proposed synchronization achieves ultra-low latency and almost ideal scalability as the number of processors increases.

  • Optimal Multicast Tree Routing for Cluster Computing in Hypercube Interconnection Networks

    Weijia JIA  Bo HAN  Pui On AU  Yong HE  Wanlei ZHOU  

     
    PAPER-Networking and System Architectures

    Vol: E87-D No:7
    Page(s): 1625-1632

    Cluster computation is used in applications that demand performance, reliability, and availability, such as cluster server groups, large-scale scientific computations, distributed databases, distributed media-on-demand servers, and search engines. In those applications, multicast can play a vital role in disseminating information among groups of servers and users. This paper proposes a set of novel, efficient, fault-tolerant multicast routing algorithms for hypercube interconnections of cluster computers using a multicast shared-tree approach. We present new algorithms for selecting an optimal core (root) and constructing the shared tree so as to minimize the average delay for multicast messages. Simulation results indicate that our algorithms are efficient in terms of short end-to-end average delay, load balance, and low resource utilization over hypercube cluster interconnection networks.
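
    The core-selection idea in this abstract can be illustrated with a minimal sketch. It assumes that delay is proportional to hop count, that the hop count between two hypercube nodes is the Hamming distance of their addresses, and that no links or nodes are faulty. The names `hamming_distance` and `select_core` are hypothetical, and the brute-force search is a stand-in, not the authors' algorithm.

```python
def hamming_distance(a: int, b: int) -> int:
    """Hop count between two hypercube nodes: number of differing address bits."""
    return bin(a ^ b).count("1")


def select_core(dimension: int, members: list[int]) -> int:
    """Return the node with the smallest total hop count to all group members.

    Brute force over all 2**dimension nodes; ties go to the lowest address.
    Minimizing the total is equivalent to minimizing the average.
    """
    if not members:
        raise ValueError("multicast group must not be empty")
    return min(
        range(1 << dimension),
        key=lambda node: (sum(hamming_distance(node, m) for m in members), node),
    )


# Toy group in a 3-dimensional hypercube (8 nodes).
group = [0b000, 0b001, 0b011]
core = select_core(3, group)
print(f"core = {core:03b}")  # prints "core = 001"
```

    Node 001 is one hop from 000 and from 011 and zero hops from itself, so its total of 2 hops is the smallest in this toy group.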

  • The Repacking Efficiency for Bandwidth Packing Problem

    Jianxin CHEN  Yuhang YANG  Lei ZHOU  

     
    PAPER-Complexity Theory

    Vol: E90-D No:7
    Page(s): 1011-1017

    Repacking is an efficient scheme for the bandwidth packing problem (BPP) in centralized networks (CNs), where a central unit allocates bandwidth to the surrounding terminals. In this paper, we study its performance by proposing a new formulation of the BPP in the CN and by introducing a repacking scheme into the next fit algorithm under the online constraint. For realistic applications, the effect of the call demand distribution is also explored by means of simulation. The results show that the repacking efficiency is significant (e.g., a minimal improvement of about 13% over the uniform distribution), especially in scenarios where small call demands dominate the network.
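
    For orientation, the plain next fit algorithm that the abstract builds on can be sketched as follows. This is only the textbook online baseline: the abstract does not describe the repacking step, so it is not modeled here. The function name `next_fit` and the integer demands are illustrative assumptions.

```python
def next_fit(demands, capacity):
    """Pack demands online with next fit.

    Only the most recently opened bin stays open. A demand that does not
    fit closes that bin for good and opens a new one.
    """
    bins = []
    free = 0  # remaining capacity of the open bin (0 before any bin exists)
    for demand in demands:
        if demand <= 0 or demand > capacity:
            raise ValueError(f"demand {demand} must be in (0, {capacity}]")
        if not bins or demand > free:
            bins.append([])  # close the current bin, open a fresh one
            free = capacity
        bins[-1].append(demand)
        free -= demand
    return bins


# Four call demands arriving in order, bins of capacity 10.
print(next_fit([5, 6, 4, 3], 10))  # prints [[5], [6, 4], [3]]
```

    The first bin is closed with half its capacity unused, which is the kind of waste a repacking scheme would aim to recover.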

  • Inference Discrepancy Based Curriculum Learning for Neural Machine Translation

    Lei ZHOU  Ryohei SASANO  Koichi TAKEDA  

     
    PAPER-Natural Language Processing

    Publicized: 2023/10/18
    Vol: E107-D No:1
    Page(s): 135-143

    In practice, even a well-trained neural machine translation (NMT) model can still make biased inferences on the training set due to distribution shifts. In human learning, if we cannot reproduce something correctly after learning it multiple times, we consider it to be more difficult. Likewise, a training example that causes a large discrepancy between inference and reference implies higher learning difficulty for the NMT model. Therefore, we propose to adopt the inference discrepancy of each training example as the difficulty criterion and to rank the training examples from easy to hard accordingly. In this way, a trained model can guide the curriculum learning process of an initial model identical to itself. This training scheme is analogous to a pretrained vanilla model guiding the learning process of a curriculum NMT model. In this paper, we assess the effectiveness of the proposed training scheme and examine the influence of translation direction, evaluation metrics, and different curriculum schedules. Experimental results on the translation benchmarks WMT14 English ⇒ German, WMT17 Chinese ⇒ English, and Multitarget TED Talks Task (MTTT) English ⇔ German, English ⇔ Chinese, English ⇔ Russian demonstrate that our proposed method consistently improves translation performance over the advanced Transformer baseline.
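
    The ranking step described in this abstract can be sketched in a few lines. Everything concrete here is an assumption made for illustration: the discrepancy measure `token_mismatch` is a toy stand-in (the abstract does not fix a metric), `build_curriculum` is a hypothetical helper, and splitting the ranked data into equal-sized stages is only one possible curriculum schedule.

```python
def token_mismatch(example):
    """Toy discrepancy: share of reference tokens absent from the model output."""
    _source, reference, hypothesis = example
    ref_tokens = reference.split()
    if not ref_tokens:
        return 0.0
    produced = set(hypothesis.split())
    return sum(tok not in produced for tok in ref_tokens) / len(ref_tokens)


def build_curriculum(examples, discrepancy, num_stages):
    """Rank examples from easy (low discrepancy) to hard and cut them into stages."""
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    ranked = sorted(examples, key=discrepancy)  # stable sort keeps tie order
    if not ranked:
        return []
    stage_size = -(-len(ranked) // num_stages)  # ceiling division
    return [ranked[i:i + stage_size] for i in range(0, len(ranked), stage_size)]


# Each example: (source, reference, output of the already trained model).
data = [
    ("s1", "a b c d", "a b c d"),  # reproduced exactly: easiest
    ("s2", "a b c d", "w x y z"),  # nothing reproduced: hardest
    ("s3", "a b c d", "a b x y"),  # half reproduced
]
for stage in build_curriculum(data, token_mismatch, num_stages=3):
    print([source for source, _ref, _hyp in stage])
```

    In a real pipeline the hypotheses would come from decoding the training set with the pretrained model, and the discrepancy would be whichever evaluation metric is being studied.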

  • Optimization Problems for Consecutive-k-out-of-n:G Systems

    Lei ZHOU  Hisashi YAMAMOTO  Taishin NAKAMURA  Xiao XIAO  

     
    PAPER-Reliability, Maintainability and Safety Analysis

    Vol: E103-A No:5
    Page(s): 741-748

    A consecutive-k-out-of-n:G system consists of n components arranged in a line; the system works if and only if at least k consecutive components work. This paper discusses optimization problems for a consecutive-k-out-of-n:G system. We first focus on the optimal number of components at the system design phase. Then, we focus on the optimal replacement time at the system operation phase by considering a preventive replacement policy, in which the system is replaced at the planned time or at the time of system failure, whichever occurs first. The expected cost rates of the two optimization problems are taken as the objective functions to be minimized. Finally, we present case studies for the proposed optimization problems and evaluate the feasibility of the policies.
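
    The system definition above can be made concrete with a short reliability calculation. The sketch below assumes statistically independent components with known working probabilities; it uses a standard run-length dynamic program rather than any formula from the paper, and the function name is hypothetical.

```python
def consecutive_k_out_of_n_g_reliability(probs, k):
    """Probability that at least k consecutive components work.

    probs[i] is the working probability of component i, with components
    assumed independent. state[j] holds the probability that no run of k
    working components has been seen so far and the current trailing run
    of working components has length j.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    state = [0.0] * k
    state[0] = 1.0
    for p in probs:
        nxt = [0.0] * k
        for run, mass in enumerate(state):
            nxt[0] += mass * (1.0 - p)  # component fails, run resets
            if run + 1 < k:
                nxt[run + 1] += mass * p  # run grows but is still too short
            # otherwise the run reaches k: the system works, mass is absorbed
        state = nxt
    return 1.0 - sum(state)


# n = 3, k = 2, every component works with probability 0.5.
# Working patterns are 110, 011 and 111, i.e. 3 of 8 equally likely states.
print(consecutive_k_out_of_n_g_reliability([0.5] * 3, 2))  # prints 0.375
```

    Evaluating this function for increasing n is one simple way to explore the trade-off behind the "optimal number of components" question, once a cost model is chosen.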

  • A Block Smoothing-Based Method for Flicker Removal in Image Sequences

    Lei ZHOU  Qiang NI  Yuanhua ZHOU  

     
    LETTER-Image Processing and Video Processing

    Vol: E89-D No:4
    Page(s): 1578-1581

    An automatic and efficient algorithm for the removal of intensity flicker is proposed. The novel repair process is founded on a block-based estimation and restoration algorithm for luminance variation. It is easy to implement and control, and it removes most intensity flicker while preserving intended effects such as fade-in and fade-out.

  • Number of Failed Components in Consecutive-k-out-of-n:G Systems and Their Applications in Optimization Problems

    Lei ZHOU  Hisashi YAMAMOTO  

     
    PAPER-Reliability, Maintainability and Safety Analysis

    Publicized: 2021/12/16
    Vol: E105-A No:6
    Page(s): 943-951

    In this paper, we study the number of failed components in a consecutive-k-out-of-n:G system. The distributions and expected values of the number of failed components when the system has failed or is working at a particular time t are evaluated. We also apply them to optimization problems concerning the optimal number of components and the optimal replacement time. Finally, we present illustrative examples for the expected number of failed components and give numerical results for the optimization problems.

  • A Design Methodology for Three-Dimensional Hybrid NoC-Bus Architecture

    Lei ZHOU  Ning WU  Xin CHEN  

     
    PAPER

    Vol: E96-C No:4
    Page(s): 492-500

    Three-dimensional integration using Through-Silicon Vias (TSVs) offers short inter-layer interconnects and higher packing density. To take advantage of these attributes, a novel hybrid 3D NoC-Bus architecture is proposed in this paper. For the vertical links, a Fake Token Bus architecture is elaborated, which uses the bandwidth efficiently by updating the token synchronously. Based on this bus architecture, a methodology for hybrid 3D NoC-Bus design is introduced. The network is hybridized with the bus in the vertical links and distributes the long links of the fully connected network across different layers, which yields a network with a diameter of only 3 hops and limited radix. In addition, a congestion-aware routing algorithm for the hybrid network is proposed. The algorithm routes packets horizontally first when the bus is busy, which balances the communication and reduces the possibility of congestion. Experimental results show that, on average and relative to regular 3D mesh implementations, our network achieves a 34.4% reduction in latency and a 43% reduction in power consumption under uniform random traffic, and a 36.9% reduction in latency and a 48% reduction in power consumption under hotspot traffic.

  • A Comparison of Pressure and Tilt Input Techniques for Cursor Control

    Xiaolei ZHOU  Xiangshi REN  

     
    PAPER-Human-computer Interaction

    Vol: E92-D No:9
    Page(s): 1683-1691

    Three experiments were conducted in this study to investigate the human ability to control pen pressure and pen tilt input, by coupling this control with cursor position, angle and scale. Pen pressure input and pen tilt input were compared in all three experiments. Experimental results show that decreasing pressure input resulted in very poor performance and was not a good input technique for any of the three experiments. In "Experiment 1-Coupling to Cursor Position", the tilt input technique performed relatively better than the increasing pressure input technique in terms of time, even though the tilt technique had a slightly higher error rate. In "Experiment 2-Coupling to Cursor Angle", the tilt input performed a little better than the increasing pressure input in terms of time, but the gap between them was not as apparent as in Experiment 1. In "Experiment 3-Coupling to Cursor Scale", tilt input performed a little better than increasing pressure input in terms of adjustment time. Based on the results of our experiments, we have inferred several design implications and guidelines.

  • SPDebugger: A Fine-Grained Deterministic Debugger for Concurrency Code

    Ziyi LIN  Yilei ZHOU  Hao ZHONG  Yuting CHEN  Haibo YU  Jianjun ZHAO  

     
    PAPER-Software Engineering

    Publicized: 2016/12/20
    Vol: E100-D No:3
    Page(s): 473-482

    When debugging, programmers often prepare test cases to reproduce buggy behaviours. However, for concurrent programs, test cases alone are typically insufficient to reproduce buggy behaviours, due to the nondeterminism of multi-threaded executions. In the literature, various approaches have been proposed to reproduce the buggy behaviours of concurrency bugs deterministically, but to the best of our knowledge, they are still limited. In particular, we have recognized three debugging scenarios from programming practice, but existing approaches can handle only one of the scenarios. In this paper, we propose a novel approach, called SPDebugger, that provides finer-grained thread control over test cases, programs under test, and even third-party library code, to reproduce a predesigned thread execution schedule. The evaluation shows that SPDebugger handles more debugging scenarios than the state-of-the-art tool, called IMUnit, with similar human effort.

  • A Linear Color Correction Method for Compressed Images and Videos

    Kebin AN  Jun SUN  Lei ZHOU  

     
    LETTER-Image Processing and Video Processing

    Vol: E89-D No:10
    Page(s): 2686-2689

    Color correction needs to be performed to improve the quality of image/video production. Typical methods perform color correction mainly in the spatial domain of the RGB color space. In this paper, a linear color correction method in the JPEG/MPEG-2 compressed domain is proposed. The correction is realized in the DCT domain of the YUV color space without full-frame decompression. Experimental results show that the visual quality of the images/videos corrected in the compressed domain is identical to the quality of the images/videos corrected in the uncompressed domain.
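
    Why a linear correction can be applied without leaving the DCT domain follows from the linearity of the transform, which the sketch below checks numerically on one 8x8 block. It assumes a per-channel correction of the form gain * x + offset and an orthonormal DCT-II, and it ignores quantization and clipping. The function names are hypothetical, and the exact correction model of the paper is not given in the abstract.

```python
import math

N = 8  # block size used by JPEG and MPEG-2


def dct_2d(block):
    """Orthonormal 2D DCT-II of an N x N block (direct, unoptimized form)."""
    def scale(u):
        return math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (
                        block[x][y]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    )
            out[u][v] = scale(u) * scale(v) * total
    return out


def correct_in_dct_domain(coeffs, gain, offset):
    """Apply gain * x + offset to a block using only its DCT coefficients.

    The gain scales every coefficient. A constant offset has no AC content,
    so it only shifts the DC term, by offset * N under this normalization.
    """
    out = [[gain * c for c in row] for row in coeffs]
    out[0][0] += offset * N
    return out


# Compare correcting pixels and then transforming with correcting coefficients.
block = [[(3 * x + 5 * y) % 17 for y in range(N)] for x in range(N)]
gain, offset = 1.2, 5.0
via_pixels = dct_2d([[gain * p + offset for p in row] for row in block])
via_coeffs = correct_in_dct_domain(dct_2d(block), gain, offset)
max_error = max(
    abs(via_pixels[u][v] - via_coeffs[u][v]) for u in range(N) for v in range(N)
)
print(max_error < 1e-9)  # prints True
```

    Both routes give the same coefficients up to floating-point rounding, which is consistent with the abstract's claim that the result matches correction in the uncompressed domain.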