Author Search Result

[Author] Jing-Yang JOU (9 hits)

Hits 1-9
  • An Efficient Power Model for IP-Level Complex Designs

    Chih-Yang HSU  Chien-Nan Jimmy LIU  Jing-Yang JOU  

     
    PAPER-VLSI Design Technology and CAD

    Vol: E86-A  No: 8  Page(s): 2073-2080

    In this paper, we propose an efficient IP-level power model with a small lookup table for complex CMOS circuits. The table has only one dimension, which maps the zero-delay charging and discharging capacitance (CDC) of a pattern pair to its real power consumption, yet it still achieves high accuracy. To reduce the table size, we collect pattern pairs with similar CDC values into a group and set only one entry in the lookup table for each group. The proposed dynamic grouping process automatically adds entries to the lookup table to cover the current CDC distribution of the design during power characterization. To make characterization more efficient, a Monte Carlo approach is used when estimating the average power of each group, skipping samples that would not noticeably improve accuracy. Once the power model of a circuit is built, the average power consumption of any test sequence can be estimated easily. Experimental results show that the tables need at most 107 entries for the ISCAS'85 benchmark circuits and that the estimation error is only 2.99% on average.
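The one-dimensional table described in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not the authors' implementation: the class `CdcPowerTable`, the fixed `group_width` used to decide which CDC values count as "similar", and the nearest-group fallback in `lookup`. The Monte Carlo stopping rule is not modeled.

```python
class CdcPowerTable:
    """One-dimensional lookup table: CDC group -> running average power."""

    def __init__(self, group_width):
        # group_width is a hypothetical tuning knob: pattern pairs whose CDC
        # values fall in the same interval of this width share one entry.
        self.group_width = group_width
        self.entries = {}  # group index -> (sample count, mean power)

    def _group(self, cdc):
        return int(cdc // self.group_width)

    def characterize(self, cdc, power):
        """Add one characterized pattern pair; create its group on demand."""
        g = self._group(cdc)
        count, mean = self.entries.get(g, (0, 0.0))
        count += 1
        mean += (power - mean) / count  # incremental mean update
        self.entries[g] = (count, mean)

    def lookup(self, cdc):
        """Return the mean power of the nearest characterized group."""
        if not self.entries:
            raise ValueError("table has not been characterized")
        g = self._group(cdc)
        nearest = min(self.entries, key=lambda k: abs(k - g))
        return self.entries[nearest][1]

    def average_power(self, cdc_sequence):
        """Estimate the average power of a test sequence from its CDC values."""
        values = [self.lookup(c) for c in cdc_sequence]
        return sum(values) / len(values)
```

Creating entries only when a new CDC group is first seen mirrors the idea that the table grows to cover the CDC distribution actually observed during characterization.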

  • Efficient Vector Compaction Methods for Power Estimation with Consecutive Sampling Techniques

    Chih-Yang HSU  Chien-Nan Jimmy LIU  Jing-Yang JOU  

     
    PAPER-VLSI Design Technology and CAD

    Vol: E87-A  No: 11  Page(s): 2973-2982

    For large circuits, vector compaction techniques can provide a faster solution for power estimation with reasonable accuracy. Because the traditional sampling approach incurs useless transitions between sampled pattern pairs once they are concatenated into a single sequence for simulation, we proposed a vector compaction method with grouping and a single-sequence consecutive sampling technique to solve this problem. However, it is quite possible that no perfect consecutive sequence free of undesired transitions can be found. In such cases, the compaction ratio of the sequence length may not improve much. In this paper, we propose an efficient approach that relaxes this limitation slightly so that multiple consecutive sequences are allowed. We also propose an algorithm that reduces the number of sequences, instead of fixing the number at one, to find better solutions to the vector compaction problem. As the experimental results demonstrate, the average compaction ratio and speedup are significantly improved by this new approach.
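The "useless transitions" mentioned in this abstract can be made concrete with a small sketch. The function `concatenate` and the bit-string patterns are hypothetical; the sketch only counts the extra transitions that concatenation introduces and does not reproduce the paper's grouping or sequence-reduction algorithms.

```python
def concatenate(pairs):
    """Concatenate sampled pattern pairs into one simulation sequence.

    Returns (sequence, undesired), where undesired counts the transitions
    introduced only because two unrelated pairs were placed back to back.
    """
    sequence = []
    undesired = 0
    for first, second in pairs:
        if sequence and sequence[-1] == first:
            # Consecutive case: the previous pair ends where this one starts,
            # so only the second pattern has to be appended.
            sequence.append(second)
        else:
            if sequence:
                undesired += 1  # extra transition: sequence[-1] -> first
            sequence.extend([first, second])
    return sequence, undesired


# Independently sampled pairs: every junction adds an extra transition.
independent = [("000", "011"), ("101", "110"), ("010", "001")]
# Consecutively sampled pairs: each pair starts where the previous one ended.
consecutive = [("000", "011"), ("011", "110"), ("110", "001")]
```

With the consecutive sample the sequence is both shorter and free of extra transitions, which is why a perfect consecutive sequence is attractive when one exists.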

  • Power-Efficient Instancy Aware DRAM Scheduling

    Gung-Yu PAN  Chih-Yen LAI  Jing-Yang JOU  Bo-Cheng Charles LAI  

     
    PAPER-Systems and Control

    Vol: E98-A  No: 4  Page(s): 942-953

    Nowadays, computer systems are limited by the power wall and the memory wall. As Dynamic Random Access Memory (DRAM) has come to dominate the power consumption of modern devices, developing power-saving approaches for DRAM has become increasingly important. Among the techniques available at different abstraction levels, scheduling-based power management policies can be applied to existing memory controllers to reduce power consumption without causing severe performance degradation. Existing power-aware schedulers cluster memory requests into sets so that a large portion of the DRAM can be switched into power-saving mode; however, only the target addresses are taken into consideration when clustering, whereas we observe that the type of a request (read or write) can play an important role. In this paper, we propose two scheduling-based power management techniques for the DRAM controller: the inter-rank read-write aware clustering approach greatly reduces the active standby power, and the intra-rank read-write aware reordering approach mitigates the performance degradation. The simulation results show that the proposed techniques reduce DRAM power by 75% on average. Compared with the existing policy, the proposed techniques achieve 10% more power reduction on average with comparable or less performance degradation.

  • Internet-Based Hierarchical Floorplan Design

    Jiann-Horng LIN  Jing-Yang JOU  Iris Hui-Ru JIANG  

     
    PAPER

    Vol: E82-A  No: 11  Page(s): 2414-2423

    With the proliferation of the transistor count in VLSI design, more and more design groups are looking for an efficient way to combine their designs. The Internet features distributed computing and resource sharing. Consequently, a hierarchical design problem can be solved effectively in the Internet environment. In this paper, we demonstrate the benefits of the Internet environment by solving the area minimization floorplan problem. We propose the RMG algorithm, which takes advantage of the Internet. Based on a model of transfer latencies, the RMG algorithm reduces the computing time by shortening the critical path in the floorplan tree. Our experimental results show that the Internet is suitable for Electronic Design Automation (EDA).

  • Delay-Optimal Technology Mapping for Hard-Wired Non-Homogeneous FPGAs

    Hsien-Ho CHUANG  Jing-Yang JOU  C. Bernard SHUNG  

     
    PAPER-Performance Optimization

    Vol: E83-A  No: 12  Page(s): 2545-2551

    A delay-optimal technology mapping algorithm is developed for a general model of FPGAs with hard-wired non-homogeneous logic block architectures, which are composed of look-up tables (LUTs) of different sizes hard-wired together. This architecture combines the short delay of hard-wired connections with the area efficiency of a non-homogeneous structure. The Xilinx XC4000 is one commercial example, in which two 4-LUTs are hard-wired to one 3-LUT. In this paper, we present a two-dimensional labeling approach and a level-2 node cut algorithm to handle the hard-wired feature. The experimental results show that our algorithm generates favorable results for Xilinx XC4000 CLBs. Over a set of MCNC benchmarks, our algorithm produces results with 17% less CLB depth than FlowMap in similar CPU time on average, and 4% less CLB depth than PDDMAP on average, while PDDMAP needs 15 times more CPU time.

  • A Variable Partitioning Algorithm of BDD for FPGA Technology Mapping

    Jie-Hong JIANG  Jing-Yang JOU  Juinn-Dar HUANG  Jung-Shian WEI  

     
    PAPER

    Vol: E80-A  No: 10  Page(s): 1813-1819

    Field Programmable Gate Arrays (FPGAs) are important devices for rapid system prototyping. Roth-Karp decomposition is one of the most popular decomposition techniques for Look-Up Table (LUT)-based FPGA technology mapping. In this paper, we propose a novel algorithm based on Binary Decision Diagrams (BDDs) for selecting good lambda set variables in Roth-Karp decomposition, so as to minimize the number of configurable logic blocks (CLBs) consumed in FPGAs. The experimental results on a set of benchmarks show that our algorithm produces much better results than previous similar approaches.
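Why the choice of lambda (bound) set matters in Roth-Karp decomposition can be shown by counting the distinct columns of the decomposition chart, often called the column multiplicity. The sketch below enumerates the truth table directly, which is only practical for small functions; it is not the BDD-based algorithm of the paper, and the function names and the example function are assumptions.

```python
from itertools import product
from math import ceil, log2


def column_multiplicity(f, num_vars, bound_set):
    """Count the distinct columns of the decomposition chart of f.

    f takes a tuple of num_vars bits and returns 0 or 1. bound_set lists the
    variable indices placed in the bound (lambda) set; the rest are free.
    """
    free_set = [i for i in range(num_vars) if i not in bound_set]
    columns = set()
    for bound_bits in product((0, 1), repeat=len(bound_set)):
        column = []
        for free_bits in product((0, 1), repeat=len(free_set)):
            x = [0] * num_vars
            for i, b in zip(bound_set, bound_bits):
                x[i] = b
            for i, b in zip(free_set, free_bits):
                x[i] = b
            column.append(f(tuple(x)))
        columns.add(tuple(column))
    return len(columns)


def encoding_bits(multiplicity):
    """Number of intermediate signals needed to tell the columns apart."""
    return ceil(log2(multiplicity))


def example(x):
    # Hypothetical 4-input function: (x0 AND x1) XOR (x2 AND x3)
    return (x[0] & x[1]) ^ (x[2] & x[3])
```

For `example`, the bound set `[0, 1]` yields two distinct columns and so needs one intermediate signal, while `[0, 2]` yields four columns and needs two. A good lambda set keeps the multiplicity, and therefore the number of LUTs, low.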

  • A New Method for Constructing IP Level Power Model Based on Power Sensitivity

    Heng-Liang HUANG  Jiing-Yuan LIN  Wen-Zen SHEN  Jing-Yang JOU  

     
    PAPER-VLSI Design Methodology

    Vol: E83-A  No: 12  Page(s): 2431-2438

    As system functionality becomes more complex, IP (Intellectual Property) reuse is becoming the prevailing style of system design. Designers need to evaluate the performance and features of every candidate IP block that can be used in their design, while IP providers hope to keep the structure of their IP blocks a secret. An IP level power model is a model that takes only the primary input statistics as parameters and does not reveal any information about the sizes of the transistors or the structure of the circuit. This paper proposes a new method for constructing a power model that is suitable for IP level circuit blocks. It is a nominal point selection method for power models based on power sensitivities. By analyzing the relationship between the dynamic power consumption of CMOS circuits and their input signal statistics, we propose a guideline for selecting the nominal point. Based on our analysis, the first nominal point is selected to minimize the average estimation error, and two other nominal points are selected to minimize the maximum estimation error. Our experimental results on a number of benchmark circuits show the effectiveness of the proposed method: average estimation accuracy within 5.78% of transistor level simulations is achieved. The proposed method can be applied to build a system level power estimation environment without revealing the contents of the IP blocks. It is therefore a promising method for IP level power model construction.
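One common way to read a sensitivity-based power model with a nominal point is as a first-order expansion around that point. The notation below is assumed for illustration and is not taken from the paper: x is the vector of primary input statistics, x^0 is the nominal point, and S_i is the power sensitivity with respect to the i-th statistic.

```latex
P(\mathbf{x}) \approx P(\mathbf{x}^{0}) + \sum_{i=1}^{n} S_i \,\bigl(x_i - x_i^{0}\bigr),
\qquad
S_i = \left.\frac{\partial P}{\partial x_i}\right|_{\mathbf{x} = \mathbf{x}^{0}}
```

Under this reading, the estimation error grows as the actual input statistics move away from the nominal point, which is why the placement of the nominal points affects both the average and the maximum error.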

  • ILP-Based Bitwidth-Aware Subexpression Sharing for Area Minimization in Multiple Constant Multiplication

    Bu-Ching LIN  Juinn-Dar HUANG  Jing-Yang JOU  

     
    PAPER-VLSI Design Technology and CAD

    Vol: E97-A  No: 4  Page(s): 931-939

    Multiple constant multiplication (MCM) is extensively adopted in digital signal processing (DSP) applications such as finite impulse response (FIR) filter designs. A set of adders replaces regular multipliers for the multiplications between input data and constant filter coefficients. Though many algorithms have been proposed to reduce the total number of adders in an MCM block for area minimization, they do not consider the actual bitwidth of each adder, so the adder count may not estimate the hardware cost well enough. Therefore, in this article we propose a bitwidth-aware MCM optimization algorithm that focuses on minimizing the total number of adder bits rather than the adder count. It first builds a subexpression graph based on the given coefficients, derives a set of constraints for adder bitwidth minimization, and then optimally solves the problem through integer linear programming (ILP). Experimental results show that the proposed algorithm can effectively reduce the required adder bit count and outperforms the existing state-of-the-art techniques.
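The difference between counting adders and counting adder bits can be seen in a small sketch. The cost model here is an assumption made for illustration: the input is unsigned, and each adder is as wide as the largest value it can produce. The two plans for the coefficient 45 are hypothetical examples, and the ILP formulation of the paper is not reproduced.

```python
def adder_bits(constant, input_bits):
    """Width of an adder whose output is constant * x for unsigned x.

    Assumption for this sketch: the adder is as wide as its largest output.
    """
    return (constant * ((1 << input_bits) - 1)).bit_length()


def cost(adder_outputs, input_bits):
    """Return (adder count, total adder bits) for a list of adder outputs."""
    total = sum(adder_bits(c, input_bits) for c in adder_outputs)
    return len(adder_outputs), total


# Two hypothetical shift-and-add realizations of 45 * x:
#   plan_a: 5x = (x << 2) + x,  45x = (5x << 3) + 5x
#   plan_b: 3x = (x << 1) + x,  45x = (3x << 4) - 3x
plan_a = [5, 45]
plan_b = [3, 45]
```

For an 8-bit input, both plans use two adders, but under this cost model `plan_a` needs 25 adder bits and `plan_b` needs 24, so an adder-count objective cannot distinguish them while a bit-count objective can.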

  • Performance-Driven Architectural Synthesis for Distributed Register-File Microarchitecture with Inter-Island Delay

    Juinn-Dar HUANG  Chia-I CHEN  Wan-Ling HSU  Yen-Ting LIN  Jing-Yang JOU  

     
    PAPER-VLSI Design Technology and CAD

    Vol: E95-A  No: 2  Page(s): 559-566

    In the deep-submicron era, wire delay is becoming a bottleneck in the pursuit of higher system clock speeds. Several distributed register (DR) architectures have been proposed to cope with this problem by keeping most wires local. In this article, we propose the distributed register-file microarchitecture with inter-island delay (DRFM-IID). Though DRFM-IID is also a DR-based architecture, it is considered more practical than the previously proposed DRFM in terms of its delay model. With this delay consideration, the synthesis task is inherently more complicated than it is without inter-island delay, since uncertain interconnect latency is very likely to seriously impact overall system performance. Therefore, we also develop a performance-driven architectural synthesis framework targeting DRFM-IID. Several factors for evaluating the quality of results, such as the number of inter-island transfers, the timing criticality of transfers, and resource utilization balancing, guide the architectural synthesis toward better optimization outcomes. The experimental results show that the latency and the number of inter-cluster transfers can be reduced by 26.9% and 37.5% on average, respectively; the latter is commonly regarded as an indicator of the power consumption of on-chip communication.