IEICE TRANSACTIONS on Fundamentals

  • Impact Factor

    0.40

  • Eigenfactor

    0.003

  • article influence

    0.1

  • Cite Score

    1.1

Volume E92-A No.12  (Publication Date:2009/12/01)

    Special Section on VLSI Design and CAD Algorithms
  • FOREWORD

    Shinji KIMURA  

     
    FOREWORD

      Page(s):
    2961-2961
  • Practical Redundant-Via Insertion Method Considering Manufacturing Variability and Reliability

    Yuji TAKASHIMA  Kazuyuki OOYA  Atsushi KUROKAWA  

     
    PAPER-Physical Level Design

      Page(s):
    2962-2970

    As integrated circuit technology has undergone continuous downscaling to improve LSI performance and reduce chip size, design for manufacturability (DFM) and design for yield (DFY) have become very important. As one of the DFM/DFY methods, a redundant via insertion technique uses as many vias as possible to connect the metal wires between different layers. In this paper, we focus on redundant vias and propose an effective redundant via insertion method for practical use to address manufacturing variability and reliability concerns. First, the results of statistical analysis for via resistance and via capacitance in some real physical layouts are shown, and the impact on circuit delay of the resistance variation of vias caused by manufacturing variability is clarified. Then, the valuation functions of delay variation, electro-migration (EM), and stress-migration (SM) are defined and a practical method for redundant via insertion is proposed. Experimental results show that an LSI with redundant vias inserted by our method is robust against manufacturing variability and reliability problems.

  • A Fast Longer Path Algorithm for Routing Grid with Obstacles Using Biconnectivity Based Length Upper Bound

    Yukihide KOHIRA  Suguru SUEHIRO  Atsushi TAKAHASHI  

     
    PAPER-Physical Level Design

      Page(s):
    2971-2978

    In recent VLSI systems, signal propagation delays are required to meet the specifications with very high accuracy. To meet the specifications, the routing of a net often needs to be detoured to increase the routing delay. A routing method should utilize a routing area with obstacles as much as possible in order to realize the specifications of nets simultaneously. In this paper, we propose a fast longer path algorithm that generates a path of a net in a routing grid so that its length is increased as much as possible. The proposed algorithm uses an upper bound on the length that takes the structure of the routing area into account. Experiments show that our algorithm utilizes a routing area with obstacles efficiently.

  • Thermal-Aware Incremental Floorplanning for 3D ICs Based on MILP Formulation

    Yuchun MA  Xin LI  Yu WANG  Xianlong HONG  

     
    PAPER-Physical Level Design

      Page(s):
    2979-2989

    In 3D IC design, thermal issues are a critical challenge. To eliminate hotspots, physical layouts are always adjusted by some incremental changes, such as shifting or duplicating hot blocks. In this paper, we classify the thermal-aware incremental changes into three categories: migrating computation, growing unit and moving hotspot blocks. However, these modifications may greatly degrade the packing area as well as the interconnect distribution. In this paper, mixed integer linear programming (MILP) models are devised according to these different incremental changes so that multiple objectives can be optimized simultaneously. Furthermore, to avoid random incremental modification, which may be inefficient and need a long runtime to converge, the potential gain is modeled for each candidate incremental change. Based on the potential gain, a novel thermal optimization flow that intelligently chooses the best incremental operation is presented. Experimental results show that migrating computation, growing unit and moving hotspot can reduce max on-chip temperature by 7%, 13% and 15% respectively on MCNC/GSRC benchmarks. Experimental results also show that the thermal optimization flow can reduce max on-chip temperature by 14% relative to the initial packings generated by an existing 3D floorplanning tool, CBA, and achieve better area and total wirelength improvement than the individual operations do. The results with the initial packings from CBA_T (Thermal-aware CBA floorplanner) show that a 13.5% temperature reduction can be obtained by our incremental optimization flow.

  • Voltage and Level-Shifter Assignment Driven Floorplanning

    Bei YU  Sheqin DONG  Song CHEN  Satoshi GOTO  

     
    PAPER-Physical Level Design

      Page(s):
    2990-2997

    Low-power design has become a significant requirement since CMOS technology entered the nanometer era. Multiple-Supply Voltage (MSV) is a popular and effective method for both dynamic and static power reduction while maintaining performance. Level shifters may cause area and Interconnect Length Overhead (ILO), and should be considered at both the floorplanning and post-floorplanning stages. In this paper, we propose a two-phase algorithm framework, called VLSAF, to solve the voltage and level-shifter assignment problem. At the floorplanning phase, we use a convex cost network flow algorithm to assign voltages and a minimum cost flow algorithm to handle level-shifter assignment. At the post-floorplanning phase, a heuristic method is adopted to redistribute white spaces and calculate the positions and shapes of level shifters. The experimental results show that VLSAF is effective.

  • MILP-Based Efficient Routing Method with Restricted Route Structure for 2-Layer Ball Grid Array Packages

    Yoichi TOMIOKA  Yoshiaki KURATA  Yukihide KOHIRA  Atsushi TAKAHASHI  

     
    PAPER-Physical Level Design

      Page(s):
    2998-3006

    In this paper, we propose a routing method for 2-layer ball grid array packages that generates a routing pattern satisfying a design rule. In our proposed method, the routing structure on each layer is restricted while keeping most feasible patterns, so that a feasible routing pattern can be obtained efficiently. A routing pattern that satisfies the design rule is formulated as a mixed integer linear program. In experiments with seven data sets, we obtain a routing pattern that satisfies the design rule within a practical time by using a mixed integer linear programming solver.

  • Intra-Die Spatial Correlation Extraction with Maximum Likelihood Estimation Method for Multiple Test Chips

    Qiang FU  Wai-Shing LUK  Jun TAO  Xuan ZENG  Wei CAI  

     
    PAPER-Device and Circuit Modeling and Analysis

      Page(s):
    3007-3015

    In this paper, a novel intra-die spatial correlation extraction method referred to as MLEMTC (Maximum Likelihood Estimation for Multiple Test Chips) is presented. In the MLEMTC method, a joint likelihood function is formulated by multiplying the set of individual likelihood functions for all test chips. This joint likelihood function is then maximized to extract a unique group of parameter values of a single spatial correlation function, which can be used for statistical circuit analysis and design. Moreover, to deal with the purely random component and measurement error contained in measurement data, the spatial correlation function combined with the correlation of white noise is used in the extraction, which significantly improves the accuracy of the extraction results. Furthermore, an LU decomposition based technique is developed to calculate the log-determinant of the positive definite matrix within the likelihood function, which solves the numerical stability problem encountered in the direct calculation. Experimental results have shown that the proposed method is efficient and practical.
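    The numerical-stability point in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `logdet_lu` and the test matrices are hypothetical, and the code only shows the general idea of summing the logarithms of LU pivots instead of forming the determinant directly, which can underflow or overflow.

```python
import math

def logdet_lu(a):
    """Return log(det(a)) for a square matrix with positive determinant.

    Uses Gaussian elimination with partial pivoting (an LU factorization)
    and sums log|pivot| so the determinant itself is never formed.
    """
    n = len(a)
    u = [list(map(float, row)) for row in a]  # work on a copy
    sign = 1.0
    log_det = 0.0
    for k in range(n):
        # Partial pivoting: pick the largest entry in column k.
        p = max(range(k, n), key=lambda i: abs(u[i][k]))
        if u[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            u[k], u[p] = u[p], u[k]
            sign = -sign  # a row swap flips the determinant's sign
        pivot = u[k][k]
        if pivot < 0.0:
            sign = -sign
        log_det += math.log(abs(pivot))
        # Eliminate column k below the pivot.
        for i in range(k + 1, n):
            f = u[i][k] / pivot
            for j in range(k, n):
                u[i][j] -= f * u[k][j]
    if sign < 0.0:
        raise ValueError("determinant is not positive")
    return log_det

# Hypothetical example: det would underflow to 0.0, but its log is finite.
tiny = [[1e-200 if i == j else 0.0 for j in range(4)] for i in range(4)]
print(logdet_lu(tiny))
```

    For a positive definite matrix a Cholesky factorization would serve the same purpose; LU is used here only because the abstract names it.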

  • An Approach for Reducing Leakage Current Variation due to Manufacturing Variability

    Tsuyoshi SAKATA  Takaaki OKUMURA  Atsushi KUROKAWA  Hidenari NAKASHIMA  Hiroo MASUDA  Takashi SATO  Masanori HASHIMOTO  Koutaro HACHIYA  Katsuhiro FURUKAWA  Masakazu TANAKA  Hiroshi TAKAFUJI  Toshiki KANAMOTO  

     
    PAPER-Device and Circuit Modeling and Analysis

      Page(s):
    3016-3023

    Leakage current is an important qualitative metric of LSI (Large Scale Integrated circuit). In this paper, we focus on the reduction of leakage current variation under process variation. First, we derive a set of quadratic equations to evaluate delay and leakage current under process variation. Using these equations, we discuss the cases of varying leakage current without degrading the delay distribution and propose a procedure to reduce the leakage current variation. The experiments show that the proposed method effectively reduces the leakage current variation by up to 50% at the 90th-percentile point of the distribution compared with the conventional design approach.

  • A Modified Nested Sparse Grid Based Adaptive Stochastic Collocation Method for Statistical Static Timing Analysis

    Xu LUO  Fan YANG  Xuan ZENG  Jun TAO  Hengliang ZHU  Wei CAI  

     
    PAPER-Device and Circuit Modeling and Analysis

      Page(s):
    3024-3034

    In this paper, we propose a Modified nested sparse grid based Adaptive Stochastic Collocation Method (MASCM) for block-based Statistical Static Timing Analysis (SSTA). The proposed MASCM employs an improved adaptive strategy derived from the existing Adaptive Stochastic Collocation Method (ASCM) to approximate the key operator MAX during timing analysis. In contrast to ASCM, which uses non-nested sparse grid and tensor product quadratures to approximate the MAX operator for weakly and strongly nonlinear conditions respectively, MASCM proposes a modified nested sparse grid quadrature to approximate the MAX operator for both weakly and strongly nonlinear conditions. In the modified nested sparse grid quadrature, we first construct the second order quadrature points based on extended Gauss-Hermite quadrature and the nested sparse grid technique, and then discard those quadrature points that do not contribute significantly to the computation accuracy to enhance the efficiency of the MAX approximation. Compared with the non-nested sparse grid quadrature, the proposed modified nested sparse grid quadrature not only employs far fewer collocation points, but also offers much higher accuracy. Compared with the tensor product quadrature, the modified nested sparse grid quadrature greatly reduces the computational cost, while still maintaining sufficient accuracy for the MAX operator approximation. As a result, the proposed MASCM provides comparable accuracy while remarkably reducing the computational cost compared with ASCM. The numerical results show that, with comparable accuracy, MASCM achieves a 50% reduction in run time compared with ASCM.

  • Find the 'Best' Solution from Multiple Analog Topologies via Pareto-Optimality

    Yu LIU  Masato YOSHIOKA  Katsumi HOMMA  Toshiyuki SHIBUYA  

     
    PAPER-Device and Circuit Modeling and Analysis

      Page(s):
    3035-3043

    This paper presents a novel method using a multi-objective optimization algorithm to automatically find the best solution from a topology library of analog circuits. First, this method abstracts the Pareto-front of each topology in the library by SPICE simulation. Then, the Pareto-front of the topology library is abstracted from the individual Pareto-fronts of the topologies in the library, following the theorem we proved. The best solution, which is defined as the nearest point to the specification on the Pareto-front of the topology library, is then calculated by the equations derived from the collinearity theorem. After local searching using the Nelder-Mead method maps the calculated best solution back to the design variable space, the non-dominated best solution is obtained. Compared to the traditional optimization methods using single-objective optimization algorithms, this work can efficiently find the best non-dominated solution from multiple topologies for different specifications without additional time-consuming optimizing iterations. The experiments demonstrate that this method is feasible and practical in actual analog designs, especially for uncertain or variant multi-dimensional specifications.
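    As a rough illustration of the two middle steps in the abstract above (combining per-topology Pareto-fronts into a library front, then picking the point nearest to a specification), here is a minimal sketch. It is not the paper's method: it assumes every objective is minimized, works on small discrete point sets, and replaces the collinearity-theorem equations with a plain Euclidean nearest-point search. The topology names and numbers are invented for illustration.

```python
import math

def dominates(a, b):
    """True if point a is no worse than b in every objective and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def library_front(fronts):
    """Merge per-topology fronts and keep only the non-dominated points.

    fronts maps a topology name to a list of objective tuples.
    Returns a list of (topology_name, point) pairs.
    """
    merged = [(name, p) for name, pts in fronts.items() for p in pts]
    return [(n, p) for n, p in merged
            if not any(dominates(q, p) for _, q in merged)]

def nearest_to_spec(front, spec):
    """Pick the front entry whose point is closest (Euclidean) to the spec."""
    return min(front, key=lambda item: math.dist(item[1], spec))

# Hypothetical two-objective data, e.g. (power, delay), both minimized.
fronts = {
    "topology_A": [(1.0, 5.0), (3.0, 3.0)],
    "topology_B": [(2.0, 2.0), (4.0, 1.0)],
}
front = library_front(fronts)
print(front)
print(nearest_to_spec(front, (2.5, 2.5)))
```

    In this toy data the point (3.0, 3.0) from topology_A drops out of the library front because (2.0, 2.0) from topology_B dominates it.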

  • Design of Voltage-Mode MAX-MIN Circuits with Low Area and Low Power Consumption

    Mohammad SOLEIMANI  Abdollah KHOEI  Khayrollah HADIDI  Vahid Fagih DINAVARI  

     
    PAPER-Device and Circuit Modeling and Analysis

      Page(s):
    3044-3051

    In this paper, new structures of voltage-mode MAX-MIN circuits are presented for nonlinear systems, fuzzy applications, neural networks, etc. A differential pair with an improved cascode current mirror is used to choose the desired input. The advantages of the proposed structure are high operating frequency, high precision, low power consumption, low area and simple expansion for multiple inputs by adding only three transistors for each extra input. The proposed circuit, which is simulated by HSPICE in a 0.35 µm CMOS process, shows a total power consumption of 85 µW at a 5 MHz operating frequency from a single 3.3-V supply. Also, the total area of the proposed circuit is about 420 µm² for two input voltages, and would increase negligibly for each extra input.

  • Fast Shape Optimization of Metalization Patterns for Power-MOSFET Based Driver

    Bo YANG  Shigetoshi NAKATAKE  

     
    PAPER-Device and Circuit Modeling and Analysis

      Page(s):
    3052-3060

    This paper addresses the problem of optimizing metalization patterns of back-end connections for the power-MOSFET based driver, since the back-end connections tend to dominate the on-resistance Ron of the driver. We propose a heuristic algorithm to seek better geometric shapes for the patterns, aimed at minimizing Ron and balancing the current distribution. In order to speed up the analysis, the equivalent resistance network of the driver is modified by inserting ideal switches to avoid repeatedly inverting the admittance matrix. With the behavioral model of the ideal switch, we can significantly accelerate the optimization. Simulation on three drivers from industrial TEG data demonstrates that our algorithm can reduce Ron effectively by shaping metals appropriately within a given routing area.

  • Fast Analysis of On-Chip Power Grid Circuits by Extended Truncated Balanced Realization Method

    Duo LI  Sheldon X.-D. TAN  

     
    PAPER-Device and Circuit Modeling and Analysis

      Page(s):
    3061-3069

    In this paper, we present a novel approach for large on-chip power grid circuit analysis. The new approach, called ETBR for extended truncated balanced realization, is based on model order reduction techniques to reduce the circuit matrices before the simulation. Different from the (improved) extended Krylov subspace methods EKS/IEKS, ETBR performs fast truncated balanced realization on the response Gramian to reduce the original system. ETBR also avoids the adverse explicit moment representation of the input signals. Instead, it uses a spectrum representation in the frequency domain for input signals by fast Fourier transformation. The proposed method is very amenable to threading-based parallel computing, as the response Gramian is computed in a Monte-Carlo-like sampling style and each sample can be computed in parallel. This contrasts with all the Krylov subspace based methods like the EKS method, where moments have to be computed in a sequential order. ETBR is also more flexible for different types of input sources and can better capture the high frequency contents than EKS, and this leads to more accurate results, especially for fast changing input signals. Experimental results on a number of large networks (up to one million nodes) show that, given the same order of the reduced model, ETBR is indeed more accurate than the EKS method, especially for input sources rich in high-frequency components. If parallel computing is explored, ETBR can be an order of magnitude faster than the EKS/IEKS method.

  • Statistical Gate Delay Model for Multiple Input Switching

    Takayuki FUKUOKA  Akira TSUCHIYA  Hidetoshi ONODERA  

     
    PAPER-Device and Circuit Modeling and Analysis

      Page(s):
    3070-3078

    In this paper, we propose a calculation method of gate delay for SSTA (Statistical Static Timing Analysis) considering MIS (Multiple Input Switching). In SSTA, a statistical maximum/minimum operation is necessary to calculate the latest/fastest arrival time of a multiple-input gate. Most SSTA approaches calculate the distribution of the latest/fastest arrival time under the SIS (Single Input Switching) assumption, thereby ignoring the effect of MIS on the gate delay and the output transition time. MIS occurs when multiple inputs of a gate switch nearly simultaneously. Thus, ignoring MIS causes error in the statistical maximum/minimum operation in SSTA. We propose a statistical gate delay model considering MIS. We verify the proposed method by SPICE based Monte Carlo simulations. Experimental results show that neglecting the MIS effect leads to 80% error in the worst case. The error of the proposed method is less than 20%.
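    For readers unfamiliar with the statistical maximum operation mentioned above, the sketch below shows one widely used textbook form, Clark's moment-matching formulas for the maximum of two jointly Gaussian arrival times. This is background only and is an assumption on our part: the abstract does not say which MAX formulation the paper uses, and the sketch does not model MIS at all. The function name `clark_max` and the sample values are hypothetical.

```python
import math

def _pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def _cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def clark_max(mu1, s1, mu2, s2, rho=0.0):
    """Mean and variance of max(X, Y) for jointly Gaussian X and Y.

    X ~ N(mu1, s1^2), Y ~ N(mu2, s2^2), correlation coefficient rho.
    """
    theta = math.sqrt(max(s1 * s1 + s2 * s2 - 2.0 * rho * s1 * s2, 0.0))
    if theta == 0.0:
        # X - Y is a constant, so the max is simply the later arrival.
        return (mu1, s1 * s1) if mu1 >= mu2 else (mu2, s2 * s2)
    alpha = (mu1 - mu2) / theta
    mean = mu1 * _cdf(alpha) + mu2 * _cdf(-alpha) + theta * _pdf(alpha)
    second = ((mu1 * mu1 + s1 * s1) * _cdf(alpha)
              + (mu2 * mu2 + s2 * s2) * _cdf(-alpha)
              + (mu1 + mu2) * theta * _pdf(alpha))
    return mean, second - mean * mean

# Hypothetical arrival times: two independent inputs with equal statistics.
print(clark_max(0.0, 1.0, 0.0, 1.0))
```

    The result is only a Gaussian approximation of the true distribution of the maximum, which is one reason SSTA papers study the accuracy of the MAX operation so closely.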

  • Low-Voltage Process-Compensated VCO with On-Chip Process Monitoring and Body-Biasing Circuit Techniques

    Ken UENO  Tetsuya HIROSE  Tetsuya ASAI  Yoshihito AMEMIYA  

     
    LETTER-Device and Circuit Modeling and Analysis

      Page(s):
    3079-3081

    A voltage-controlled oscillator (VCO) tolerant to process variations at lower supply voltage was proposed. The circuit consists of an on-chip threshold-voltage-monitoring circuit, a current-source circuit, a body-biasing control circuit, and the delay cells of the VCO. Because variations in low-voltage VCO frequency are mainly determined by variations in the current in the delay cells, a current-compensation technique was adopted by using on-chip threshold-voltage-monitoring and body-biasing circuit techniques. Monte Carlo SPICE simulations demonstrated that variations in the oscillation frequency were suppressed by about 65% at a 1-V supply voltage when comparing frequencies with and without the proposed techniques.

  • Accurate Systematic Hot-Spot Scoring Method and Score-Based Fixing Guidance Generation

    Yonghee PARK  Junghoe CHOI  Jisuk HONG  Sanghoon LEE  Moonhyun YOO  Jundong CHO  

     
    LETTER-Device and Circuit Modeling and Analysis

      Page(s):
    3082-3085

    Research on predicting and removing lithographic hot-spots has been prevalent in the recent semiconductor industry, and achieving high-quality detection coverage is known to be one of the most difficult challenges. To provide physical design implementation with the designer's preferences on fixing hot-spots, in this paper we present a novel and accurate hot-spot detection method, the so-called "leveling and scoring" algorithm, based on a weighted combination of image quality parameters (i.e., normalized image log-slope (NILS), mask error enhancement factor (MEEF), and depth of focus (DOF)) from lithography simulation. In our algorithm, first, a hot-spot scoring function considering severity level is calibrated with process window qualification, and then a least-square regression method is used to calibrate the weighting coefficients for each image quality parameter. In this way, after we obtain the scoring function with wafer results, our method can be applied to future designs using the same process. Using this calibrated scoring function, we can successfully generate fixing guidance and rules to detect hot-spot areas by locating the edge bias value that leads to a hot-spot-free score level. Finally, we integrate the hot-spot fixing guidance information into a layout editor to facilitate a user-friendly design environment. Applying our method to memory devices of the 60 nm node and below, we could successfully attain sufficient process window margin to yield high mass production.

  • Constrained Stimulus Generation with Self-Adjusting Using Tabu Search with Memory

    Yanni ZHAO  Jinian BIAN  Shujun DENG  Zhiqiu KONG  Kang ZHAO  

     
    PAPER-Logic Synthesis, Test and Verification

      Page(s):
    3086-3093

    Despite the growing research effort in formal verification, industrial verification often relies on the constrained random simulation methodology, which is supported by constraint solvers as the stimulus generator integrated within the simulator, especially for today's large designs with complex constraints. These stimulus generators need to be fast and produce well-distributed stimuli to maintain simulation performance. In this paper, we propose a dynamic method to guide stimulus generation by SAT solvers. An adjusting strategy named Tabu Search with Memory (TSwM) is integrated in the stimulus generator for the search and prune processes along with the constraint solver. Experimental results show that the method proposed in this paper could generate well-distributed stimuli with good performance.

  • Trade-Off Analysis between Timing Error Rate and Power Dissipation for Adaptive Speed Control with Timing Error Prediction

    Hiroshi FUKETA  Masanori HASHIMOTO  Yukio MITSUYAMA  Takao ONOYE  

     
    PAPER-Logic Synthesis, Test and Verification

      Page(s):
    3094-3102

    Timing margin of a chip varies chip by chip due to manufacturing variability, and depends on operating environment and aging. Adaptive speed control with timing error prediction is promising to mitigate the timing margin variation, whereas it inherently has a critical risk of timing error occurrence when a circuit is slowed down. This paper presents how to evaluate the relation between timing error rate and power dissipation in self-adaptive circuits with timing error prediction. The discussion is experimentally validated using adders in subthreshold operation in a 90 nm CMOS process. We show a trade-off between timing error rate and power dissipation, and reveal the dependency of the trade-off on design parameters.

  • Incremental Buffer Insertion and Module Resizing Algorithm Using Geometric Programming

    Qing DONG  Bo YANG  Jing LI  Shigetoshi NAKATAKE  

     
    PAPER-Logic Synthesis, Test and Verification

      Page(s):
    3103-3110

    This paper presents an efficient algorithm for incremental buffer insertion and module resizing for a fully placed floorplan. Our algorithm offers a method to use the white space in a given floorplan to resize modules and insert buffers, and at the same time keeps the resultant floorplan as close to the original one as possible. Both the buffer insertion and module resizing are modeled as geometric programming problems, and can be solved extremely efficiently using newly developed solution methods. The experimental results suggest that the wire length difference between the initial floorplan and the result is quite small (less than 5%), and the global structure of the initial floorplan is preserved very well.

  • Optimizing Controlling-Value-Based Power Gating with Gate Count and Switching Activity

    Lei CHEN  Shinji KIMURA  

     
    PAPER-Logic Synthesis, Test and Verification

      Page(s):
    3111-3118

    In this paper, a new heuristic algorithm is proposed to optimize the power domain clustering in controlling-value-based (CV-based) power gating technology. In this algorithm, both the switching activity of sleep signals (p) and the overall number of sleep gates (gate count, N) are considered, and the sum of the products of p and N is optimized. The algorithm effectively realizes the total power reduction obtainable from CV-based power gating. Even when the maximum depth is kept the same, the proposed algorithm can still achieve approximately 10% more power reduction than the prior algorithms. Furthermore, detailed comparisons between the proposed heuristic algorithm and other possible heuristic algorithms are also presented. HSPICE simulation results show that over 26% of total power reduction can be obtained by using the new heuristic algorithm. In addition, the effect of dynamic power reduction through the CV-based power gating method and the delay overhead caused by the switching of sleep transistors are also shown in this paper.

  • X-Handling for Current X-Tolerant Compactors with More Unknowns and Maximal Compaction

    Youhua SHI  Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-Logic Synthesis, Test and Verification

      Page(s):
    3119-3127

    This paper presents a novel X-handling technique, which removes the effect of unknowns on compacted test response with maximal compaction ratio. The proposed method combines with the current X-tolerant compactors and inserts masking cells on scan paths to selectively mask X's. By doing this, the number of unknown responses in each scan-out cycle could be reduced to a reasonable level such that the target X-tolerant compactor would tolerate with guaranteed possible error detection. It guarantees no test loss due to the effect of X's, and achieves the maximal compaction that the target response compactor could provide as well. Moreover, because the masking cells are only inserted on the scan paths, it has no performance degradation of the designs. Experimental results demonstrate the effectiveness of the proposed method.

  • Addressing Defect Coverage through Generating Test Vectors for Transistor Defects

    Yoshinobu HIGAMI  Kewal K. SALUJA  Hiroshi TAKAHASHI  Shin-ya KOBAYASHI  Yuzo TAKAMATSU  

     
    PAPER-Logic Synthesis, Test and Verification

      Page(s):
    3128-3135

    Shorts and opens are two major kinds of defects that are most likely to occur in Very Large Scale Integrated Circuits. In modern Integrated Circuit devices these defects must be considered not only at the gate level but also at the transistor level. In this paper, we propose a method for generating test vectors that targets both transistor shorts (tr-shorts) and transistor opens (tr-opens). Since two consecutive test vectors need to be applied in order to detect tr-opens, we assume a launch on capture (LOC) test application mechanism. This makes it possible to detect delay type defects. Further, the proposed method employs existing stuck-at test generation tools, thus requiring no change in the design and development flow and no development of new tools. Experimental results for benchmark circuits demonstrate the effectiveness of the proposed method by providing 100% fault efficiency while the test set size is still moderate.

  • An Error Diagnosis Technique Based on Location Sets to Rectify Subcircuits

    Kosuke SHIOKI  Narumi OKADA  Toshiro ISHIHARA  Tetsuya HIROSE  Nobutaka KUROKI  Masahiro NUMA  

     
    PAPER-Logic Synthesis, Test and Verification

      Page(s):
    3136-3142

    This paper presents an error diagnosis technique for incremental synthesis, called EXLLS (Extended X-algorithm for LUT-based circuit model based on Location sets to rectify Subcircuits), which rectifies five or more functional errors in the whole circuit based on location sets to rectify subcircuits. The conventional error diagnosis technique, called EXLIT, tries to rectify five or more functional errors based on incremental rectification for subcircuits. However, the solution depends on the selection and the order of modifications on subcircuits, which increases the number of locations to be changed. To overcome this problem, we propose EXLLS based on location sets to rectify subcircuits, which obtains two or more solutions by separating i) extraction of location sets to be rectified, and ii) rectification for the whole circuit based on the location sets. Thereby EXLLS can rectify five or more errors with fewer locations to change. Experimental results have shown that EXLLS reduces the increase in the number of locations to be rectified with the conventional technique by 90.1%.

  • Communication Synthesis for Interconnect Minimization in Multicycle Communication Architecture

    Ya-Shih HUANG  Yu-Ju HONG  Juinn-Dar HUANG  

     
    PAPER-High-Level Synthesis and System-Level Design

      Page(s):
    3143-3150

    In deep-submicron technology, several state-of-the-art architectural synthesis flows have already adopted the distributed register architecture to cope with the increasing wire delay by allowing multicycle communication. In this article, we regard communication synthesis targeting a refined regular distributed register architecture, named RDR-GRS, as a problem of simultaneous data transfer routing and scheduling for global interconnect resource minimization. We also present an innovative algorithm that considers both spatial and temporal perspectives. It features both a concentration-oriented path router that gathers wire-sharable data transfers and a channel-based time scheduler that resolves contentions for wires in a channel, which operate in the spatial and temporal domains, respectively. The experimental results show that the proposed algorithm can significantly outperform existing related works.

  • Peak Temperature Reduction by Physical Information Driven Behavioral Synthesis with Resource Usage Allocation

    Junbo YU  Qiang ZHOU  Gang QU  Jinian BIAN  

     
    PAPER-High-Level Synthesis and System-Level Design

      Page(s):
    3151-3159

    High temperature adversely impacts a circuit's reliability, performance, and leakage power. During behavioral synthesis, both resource usage allocation and resource binding influence the thermal profile. Current thermal-aware behavioral synthesis approaches do not utilize location information of resources from the floorplan and, in addition, focus only on binding, ignoring allocation. This paper proposes thermal-aware behavioral synthesis with resource usage allocation. Based on a hybrid metric of physical location information and temperature, we rebind operations and reallocate the number of resources under an area constraint. Our approach effectively controls peak temperature and creates even power densities among resources of different types and within resources of the same type. Experimental results show an average of 8.6 drop in peak temperature and 5.3% saving of total power consumption with little latency overhead.

  • Energy-Aware Memory Allocation Framework for Embedded Data-Intensive Signal Processing Applications

    Florin BALASA  Ilie I. LUICAN  Hongwei ZHU  Doru V. NASUI  

     
    PAPER-High-Level Synthesis and System-Level Design

      Page(s):
    3160-3168

    Many signal processing systems, particularly in the multimedia and telecommunication domains, are synthesized to execute data-intensive applications: their cost related aspects -- namely power consumption and chip area -- are heavily influenced, if not dominated, by the data access and storage aspects. This paper presents an energy-aware memory allocation methodology. Starting from the high-level behavioral specification of a given application, this framework performs the assignment of the multidimensional signals to the memory layers -- the on-chip scratch-pad memory and the off-chip main memory -- the goal being the reduction of the dynamic energy consumption in the memory subsystem. Based on the assignment results, the framework subsequently performs the mapping of signals into both memory layers such that the overall amount of data storage be reduced. This software system yields a complete allocation solution: the exact storage amount on each memory layer, the mapping functions that determine the exact locations for any array element (scalar signal) in the specification, and an estimation of the dynamic energy consumption in the memory subsystem.

  • Floorplan-Aware High-Level Synthesis for Generalized Distributed-Register Architectures

    Akira OHCHI  Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-High-Level Synthesis and System-Level Design

      Page(s):
    3169-3179

    As device feature size decreases, interconnection delay becomes the dominating factor of total circuit delay. Distributed-register architectures can reduce the influence of interconnection delay. They may, however, increase circuit area because they require many local registers. Moreover, original distributed-register architectures do not consider control signal delay, which may be the bottleneck in a circuit. In this paper, we propose a high-level synthesis method targeting a generalized distributed-register architecture in which we introduce shared/local registers and global/local controllers. Our method is based on iterative improvement of scheduling/binding and floorplanning. First, we prepare shared-register groups with global controllers, each of which corresponds to a single functional unit. As iterations proceed, we use local registers and local controllers for functional units on a critical path. Shared-register groups physically located close to each other are merged into a single group. Accordingly, global controllers are merged. Finally, our method obtains a generalized distributed-register architecture in which scheduling/binding and floorplanning are simultaneously optimized. Experimental results show that the area is decreased by 4.7% while maintaining circuit performance equal to that obtained using original distributed-register architectures.

  • Low-Power Embedded Processor Design Using Branch Direction

    Gi-Ho PARK  Jung-Wook PARK  Gunok JUNG  Shin-Dug KIM  

     
    LETTER-High-Level Synthesis and System-Level Design

      Page(s):
    3180-3181

    This paper presents a wordline gating logic for reducing unnecessary BTB accesses. Partial bit of the branch predictor was simultaneously recorded in the middle of BTB to prevent further SRAM operation. Experimental results with embedded applications showed that the proposed mechanism reduces around 38% of BTB power consumption.

  • Rapid Design Space Exploration of a Reconfigurable Instruction-Set Processor

    Farhad MEHDIPOUR  Hamid NOORI  Koji INOUE  Kazuaki MURAKAMI  

     
    PAPER-Embedded, Real-Time and Reconfigurable Systems

      Page(s):
    3182-3192

    The multitude of parameters in the design process of a reconfigurable instruction-set processor (RISP) may lead to a large design space and remarkable complexity. A quantitative design approach uses the data collected from applications to satisfy design constraints and optimize the design goals while considering the applications' characteristics; however, it depends heavily on the designer's observations and analyses. Exploring the design space can be considered an effective technique to find a proper balance among various design parameters. However, this approach is computationally expensive when the performance evaluation of the design points is based on the synthesis-and-simulation technique. A combined analytical and simulation-based model (CAnSO) is proposed and validated for performance evaluation of a typical RISP. The proposed model consists of an analytical core that incorporates statistics collected from cycle-accurate simulation to make a reasonable evaluation and provide valuable insight. CAnSO has clear speed advantages and can therefore be used to ease the cumbersome design space exploration of a RISP and to quickly evaluate the performance of slightly modified architectures.

  • A System-Level Model of Design Space Exploration for a Tile-Based 3D Graphics SoC Refinement

    Liang-Bi CHEN  Chi-Tsai YEH  Hung-Yu CHEN  Ing-Jer HUANG  

     
    PAPER-Embedded, Real-Time and Reconfigurable Systems

      Page(s):
    3193-3202

    3D graphics applications are widely used in consumer electronics, a trend that is expected to continue. In general, a higher abstraction level is used to model a complex system such as a 3D graphics SoC. However, the issue is how to use efficient methods to traverse the design space hierarchically, reduce simulation time, and refine the performance quickly. This paper demonstrates a system-level design space exploration model for a tile-based 3D graphics SoC refinement. This model uses UML tools, which can assist designers in traversing the whole system, and reduces simulation time dramatically by adopting SystemC. As a result, the system performance is improved by 198% for the geometry function and 69% for the rendering function.

  • A 48 Cycles/MB H.264/AVC Deblocking Filter Architecture for Ultra High Definition Applications

    Dajiang ZHOU  Jinjia ZHOU  Jiayi ZHU  Satoshi GOTO  

     
    PAPER-Embedded, Real-Time and Reconfigurable Systems

      Page(s):
    3203-3210

    In this paper, a highly parallel deblocking filter architecture for H.264/AVC is proposed to process one macroblock in 48 clock cycles and give real-time support to QFHD@60 fps sequences at less than 100 MHz. 4 edge filters organized in 2 groups for simultaneously processing vertical and horizontal edges are applied in this architecture to enhance its throughput. While parallelism increases, pipeline hazards arise owing to the latency of edge filters and data dependency of deblocking algorithm. To solve this problem, a zig-zag processing schedule is proposed to eliminate the pipeline bubbles. Data path of the architecture is then derived according to the processing schedule and optimized through data flow merging, so as to minimize the cost of logic and internal buffer. Meanwhile, the architecture's data input rate is designed to be identical to its throughput, while the transmission order of input data can also match the zig-zag processing schedule. Therefore no intercommunication buffer is required between the deblocking filter and its previous component for speed matching or data reordering. As a result, only one 2464 two-port SRAM as internal buffer is required in this design. When synthesized with SMIC 130 nm process, the architecture costs a gate count of 30.2 k, which is competitive considering its high performance.
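
    The clock-frequency claim in the abstract above can be checked with a short calculation. The sketch below assumes that QFHD means 3840x2160 pixels, that macroblocks are 16x16 pixels, and that there are no stall cycles between macroblocks; none of these assumptions is stated in the abstract, and the function name is illustrative only.

```python
# Assumptions (not stated in the abstract): QFHD = 3840x2160 pixels,
# 16x16-pixel macroblocks, and no idle cycles between macroblocks.
WIDTH, HEIGHT = 3840, 2160
MB_SIZE = 16
FPS = 60
CYCLES_PER_MB = 48


def required_clock_hz(width, height, fps, cycles_per_mb, mb_size=MB_SIZE):
    """Minimum clock frequency needed to filter every macroblock in real time."""
    mbs_per_frame = (width // mb_size) * (height // mb_size)
    return mbs_per_frame * fps * cycles_per_mb


clock_hz = required_clock_hz(WIDTH, HEIGHT, FPS, CYCLES_PER_MB)
print(f"{clock_hz / 1e6:.3f} MHz")  # prints "93.312 MHz"
```

    Under these assumptions the required clock is about 93.3 MHz, which is consistent with the "less than 100 MHz" figure quoted in the abstract.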

  • Worst-Case Flit and Packet Delay Bounds in Wormhole Networks on Chip

    Yue QIAN  Zhonghai LU  Wenhua DOU  

     
    PAPER-Embedded, Real-Time and Reconfigurable Systems

      Page(s):
    3211-3220

    We investigate per-flow flit and packet worst-case delay bounds in on-chip wormhole networks. Such investigation is essential in order to provide guarantees under worst-case conditions in cost-constrained systems, as required by many hard real-time embedded applications. We first propose analysis models for flow control, link and buffer sharing. Based on these analysis models, we obtain an open-ended service analysis model capturing the combined effect of flow control, link and buffer sharing. With the service analysis model, we compute equivalent service curves for individual flows, and then derive their flit and packet delay bounds. Our experimental results verify that our analytical bounds are correct and tight.
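
    As a minimal illustration of how a delay bound follows from a service curve, the sketch below uses the standard network-calculus result for a token-bucket-constrained flow served by a rate-latency server. The curve shapes, parameter names, and numbers are assumptions chosen for illustration; the equivalent service curves derived in the paper may take a different form.

```python
def delay_bound(burst, rate, service_rate, latency):
    """Worst-case delay of a flow with arrival curve burst + rate * t
    served by a rate-latency curve service_rate * max(0, t - latency).

    All quantities are hypothetical: burst in flits, rates in flits per
    cycle, latency and the returned bound in cycles.
    """
    if rate > service_rate:
        raise ValueError("arrival rate exceeds service rate; delay is unbounded")
    # The bound is the maximum horizontal distance between the two curves.
    return latency + burst / service_rate


# Example: an 8-flit burst, served at 0.5 flits/cycle after a 6-cycle latency.
print(delay_bound(burst=8, rate=0.2, service_rate=0.5, latency=6))  # 22.0
```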

  • Low Cost Design of an Advanced Encryption Standard (AES) Processor Using a New Common-Subexpression-Elimination Algorithm

    Ming-Chih CHEN  Shen-Fu HSIAO  

     
    PAPER-Embedded, Real-Time and Reconfigurable Systems

      Page(s):
    3221-3228

    In this paper, we propose an area-efficient design of an Advanced Encryption Standard (AES) processor by applying a new common-subexpression-elimination (CSE) method to the sub-functions of various transformations required in AES. The proposed method reduces the area cost of realizing the sub-functions by extracting the common factors in the bit-level XOR/AND-based sum-of-product expressions of these sub-functions using a new CSE algorithm. Cell-based implementation results show that the AES processor with our proposed CSE method achieves significant area improvement compared with previous designs.

  • A Scan-Based Attack Based on Discriminators for AES Cryptosystems

    Ryuta NARA  Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-Embedded, Real-Time and Reconfigurable Systems

      Page(s):
    3229-3237

    A scan chain is one of the most important testing techniques, but it can be exploited for side-channel attacks against a cryptography LSI. We focus on scan-based attacks, in which scan chains are targeted for side-channel attacks. Conventional scan-based attacks consider only a scan chain composed solely of the registers in a cryptography circuit. However, a cryptography LSI usually includes many other circuits such as memories and microprocessors. This means that the conventional attacks cannot be applied to a practical scan chain composed of various types of registers. In this paper, we propose a scan-based attack that can decipher the secret key in an AES cryptography LSI composed of an AES circuit and other circuits. By focusing on the bit pattern of a specific register and monitoring its change, our scan-based attack eliminates the influence of registers included in circuits other than AES. Our attack does not depend on the scan chain architecture, and it can decipher practical AES cryptography LSIs.

  • A Two-Level Cache Design Space Exploration System for Embedded Applications

    Nobuaki TOJO  Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-Embedded, Real-Time and Reconfigurable Systems

      Page(s):
    3238-3247

    Recently, a two-level cache, consisting of an L1 cache and an L2 cache, is commonly used in a processor. Particularly in an embedded system in which a single application or a class of applications is repeatedly executed on a processor, the cache configuration can be customized such that an optimal one is achieved. An optimal two-level cache configuration, which minimizes overall memory access time or memory energy consumption, can be obtained by varying three cache parameters: the number of sets, the line size, and the associativity, for the L1 cache and the L2 cache. In this paper, we first extend the L1 cache simulation algorithm so that we can explore two-level cache configurations. Second, we propose two-level cache design space exploration algorithms, CRCB-T1 and CRCB-T2, each of which is based on applying the Cache Inclusion Property to two-level cache configurations. Each of the proposed algorithms realizes exact cache simulation but decreases the number of cache hit/miss judgments by a factor of several thousand. Experimental results show that, by using our approach, the number of cache hit/miss judgments required to optimize a cache configuration is reduced to 1/50-1/5500 compared to the exhaustive approach. As a result, our proposed approach runs an average of 1398.25 times faster overall than the exhaustive approach. Our proposed cache simulation approach achieves the world's fastest two-level cache design space exploration.
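
    To illustrate why an inclusion property saves hit/miss judgments, the sketch below shows the classic single-level case only: with LRU replacement and a fixed number of sets and line size, a hit in a cache of a given associativity is also a hit in every cache of larger associativity, so one pass over the trace yields hit counts for all associativities. This is not the CRCB-T1 or CRCB-T2 algorithm from the paper, and the function name and trace are hypothetical.

```python
from collections import defaultdict


def hits_per_associativity(addresses, num_sets, line_size, max_assoc):
    """Return {associativity: hit count} from a single pass over the trace.

    Assumes LRU replacement and a fixed number of sets and line size.
    """
    stacks = defaultdict(list)  # per-set LRU stack, most recently used first
    hits = {a: 0 for a in range(1, max_assoc + 1)}
    for addr in addresses:
        block = addr // line_size
        stack = stacks[block % num_sets]
        if block in stack:
            depth = stack.index(block)  # 0 means most recently used
            # Inclusion: a block found at this depth hits in every cache
            # whose associativity is greater than the depth.
            for a in range(depth + 1, max_assoc + 1):
                hits[a] += 1
            stack.remove(block)
        stack.insert(0, block)
        del stack[max_assoc:]  # nothing deeper can hit in any simulated cache
    return hits


trace = [0, 16, 0, 32, 16, 0]  # hypothetical byte addresses
print(hits_per_associativity(trace, num_sets=1, line_size=16, max_assoc=4))
# {1: 0, 2: 1, 3: 3, 4: 3}
```

    A single stack search per access replaces one hit/miss judgment per candidate associativity, which is the kind of saving the abstract refers to, here in its simplest form.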

  • Entropy Decoding Processor for Modern Multimedia Applications

    Sumek WISAYATAKSIN  Dongju LI  Tsuyoshi ISSHIKI  Hiroaki KUNIEDA  

     
    PAPER-Embedded, Real-Time and Reconfigurable Systems

      Page(s):
    3248-3257

    An entropy decoding engine plays an important role in modern multimedia decoders. Previous studies that focused on decoding performance paid considerable attention to only one parameter, such as the data parsing speed, but they did not consider the performance impact of table configuration time and memory size. In this paper, we developed a novel method of entropy decoding based on a two-step group matching scheme. Our approach achieves high performance in both data parsing speed and configuration time while requiring little memory. We also deployed our decoding scheme to implement an entropy decoding processor, which performs operations based on normal processor instructions and VLD instructions for decoding variable length codes. Several extended VLD instructions are prepared to speed up the bitstream parsing process in modern multimedia applications. This processor provides a solution with software flexibility and hardware speed for stand-alone entropy decoding engines. The VLSI hardware is designed with the Language for Instruction Set Architecture (LISA), with 23 Kgates and a 110 MHz maximum clock frequency under TSMC 0.18 µm technology. The experimental simulations revealed that the proposed processor achieves higher performance and is suitable for many practical applications such as MPEG-2, MPEG-4, H.264/AVC and AAC.

  • Heuristic Instruction Scheduling Algorithm Using Available Distance for Partial Forwarding Processor

    Takuji HIEDA  Hiroaki TANAKA  Keishi SAKANUSHI  Yoshinori TAKEUCHI  Masaharu IMAI  

     
    PAPER-Embedded, Real-Time and Reconfigurable Systems

      Page(s):
    3258-3267

    Partial forwarding is a design method to place forwarding paths on a part of processor pipeline. Hardware cost of processor can be reduced without performance loss by partial forwarding. However, compiler with the instruction scheduler which considers partial forwarding structure of the target processor is required since conventional scheduling algorithm cannot make the most of partial forwarding structure. In this paper, we propose a heuristic instruction scheduling method for processors with partial forwarding structure. The proposed algorithm uses available distance to schedule instructions which are suitable for the target partial forwarding processor. Experimental results show that the proposed method generates near-optimal solutions in practical time and some of the optimized codes for partial forwarding processor run in the shortest time among the target processors. It also shows that the proposed method is superior to hazard detection unit.

  • Efficient Cut Enumeration Heuristics for Depth-Optimum Technology Mapping for LUT-Based FPGAs

    Taiga TAKATA  Yusuke MATSUNAGA  

     
    PAPER-Embedded, Real-Time and Reconfigurable Systems

      Page(s):
    3268-3275

    Recent technology mappers for LUT-based FPGAs employ cut enumeration. Although many cuts are often needed to find a good network, enumerating all the cuts of large size consumes a lot of run-time. Existing algorithms employ bottom-up merging, which calculates Cartesian products of the fanins' cuts for each node. The number of cuts is much smaller than the size of the Cartesian products in most cases. Thus, the existing algorithms are inefficient. Furthermore, the number of cuts increases exponentially with the size of cuts, which makes the run-time much longer. Several algorithms that enumerate only a subset of the cuts have been presented, but they tend to degrade the quality of networks. This paper presents two algorithms to enumerate cuts: an exhaustive enumeration and a partial enumeration. Both of them are efficient because they do not employ bottom-up merging. The partial enumeration reduces the number of enumerated cuts with a guarantee that a depth-minimum network can be constructed. The experimental results show that the exhaustive enumeration runs about 5 and 13 times faster than the existing bottom-up algorithm for K = 8, 9, respectively, while keeping the same results. On the other hand, the partial enumeration runs about 9 and 29 times faster than the existing algorithm for K = 8, 9, respectively. The average area of networks derived from the sets of cuts enumerated by the partial enumeration is only 4% larger than that derived using all the cuts, and the depth is the same.
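
    The conventional bottom-up merging that the abstract describes as the baseline can be sketched as follows. This is the existing approach, not either of the paper's proposed algorithms; the data layout (a dictionary of fanin lists plus a topological order) and the function name are assumptions, and dominated cuts are not pruned.

```python
from itertools import product


def enumerate_cuts(fanins, order, k):
    """Enumerate K-feasible cuts by bottom-up merging.

    fanins: node -> list of fanin nodes (empty list for primary inputs).
    order:  all nodes in topological order (fanins before fanouts).
    Returns node -> set of cuts, each cut being a frozenset of nodes.
    """
    cuts = {}
    for node in order:
        node_cuts = {frozenset([node])}  # the trivial cut
        if fanins[node]:
            # Cartesian product of the fanins' cut sets; many combinations
            # are discarded because the merged cut exceeds K leaves.
            for combo in product(*(cuts[f] for f in fanins[node])):
                merged = frozenset().union(*combo)
                if len(merged) <= k:
                    node_cuts.add(merged)
        cuts[node] = node_cuts
    return cuts


# Hypothetical network: n1 = f(a, b), n2 = g(n1, c).
fanins = {"a": [], "b": [], "c": [], "n1": ["a", "b"], "n2": ["n1", "c"]}
order = ["a", "b", "c", "n1", "n2"]
for cut in sorted(enumerate_cuts(fanins, order, k=3)["n2"], key=sorted):
    print(sorted(cut))
```

    The inner loop visits every element of the Cartesian product even when most merged sets are rejected or duplicated, which is the inefficiency the abstract points out.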

  • Special Section on Image Media Quality
  • FOREWORD

    Mitsuho YAMADA  

     
    FOREWORD

      Page(s):
    3276-3276
  • Two Principles of High-Level Human Visual Processing Potentially Useful for Image and Video Quality Assessment

    Shin'ya NISHIDA  

     
    INVITED PAPER

      Page(s):
    3277-3283

    Objective assessment of image and video quality should be based on a correct understanding of subjective assessment by human observers. Previous models have incorporated the mechanisms of early visual processing in image quality metrics, enabling us to evaluate the visibility of errors from the original images. However, to understand how human observers perceive image quality, one should also consider higher stages of visual processing where perception is established. In higher stages, the visual system presumably represents a visual scene as a collection of meaningful components such as objects and events. Our recent psychophysical studies suggest two principles related to this level of processing. First, the human visual system integrates shape and color signals along perceived motion trajectories in order to improve visibility of the shape and color of moving objects. Second, the human visual system estimates surface reflectance properties like glossiness using simple image statistics rather than by inverse computation of image formation optics. Although the underlying neural mechanisms are still under investigation, these computational principles are potentially useful for the development of effective image processing technologies and for quality assessment. Ideally, if a model can specify how a given image is transformed into high-level scene representations in the human brain, it would predict many aspects of subjective image quality, including fidelity and naturalness.

  • Video-Quality Estimation Based on Reduced-Reference Model Employing Activity-Difference

    Toru YAMADA  Yoshihiro MIYAMOTO  Yuzo SENDA  Masahiro SERIZAWA  

     
    PAPER-Evaluation

      Page(s):
    3284-3290

    This paper presents a Reduced-reference based video-quality estimation method suitable for individual end-user quality monitoring of IPTV services. With the proposed method, the activity values for individual given-size pixel blocks of an original video are transmitted to end-user terminals. At the end-user terminals, the video quality of a received video is estimated on the basis of the activity-difference between the original video and the received video. Psychovisual weightings and video-quality score adjustments for fatal degradations are applied to improve estimation accuracy. In addition, low-bit-rate transmission is achieved by using temporal sub-sampling and by transmitting only the lower six bits of each activity value. The proposed method achieves accurate video quality estimation using only low-bit-rate original video information (15 kbps for SDTV). The correlation coefficient between actual subjective video quality and estimated quality is 0.901 with 15 kbps side information. The proposed method does not need computationally demanding spatial and gain-and-offset registrations. Therefore, it is suitable for real-time video-quality monitoring in IPTV services.

  • Estimation of Mosquito Noise Level from Decoded Picture

    Kenji SUGIYAMA  Naoya SAGARA  Yohei KASHIMURA  

     
    PAPER-Evaluation

      Page(s):
    3291-3296

    With DCT coding, block artifact and mosquito noise degradations appear in decoded pictures. The control of post filtering is important to reduce degradations without causing side effects. Decoding information is useful, if the filter is inside or close to the encoder; however, it is difficult to control with independent post filtering, such as in a display. In this case, control requires the estimation of the artifact from only the decoded picture. In this work, we describe an estimation method that determines the mosquito noise block and level. In this method, the ratio of spatial activity is taken between the mosquito block and the neighboring flat block. We test the proposed method using the reconstructed pictures which are coded with different quantization scales. We recognize that the results are mostly reasonable with the different quantizations.

  • Non-intrusive Packet-Layer Model for Monitoring Video Quality of IPTV Services

    Kazuhisa YAMAGISHI  Takanori HAYASHI  

     
    PAPER-Evaluation

      Page(s):
    3297-3306

    Developing a non-intrusive packet-layer model is required to passively monitor the quality of experience (QoE) during service. We propose a packet-layer model that can be used to estimate the video quality of IPTV using quality parameters derived from transmitted packet headers. The computational load of the model is lighter than that of the model that takes video signals and/or video-related bitstream information such as motion vectors as input. This model is applicable even if the transmitted bitstream information is encrypted because it uses transmitted packet headers rather than bitstream information. For developing the model, we conducted three extensive subjective quality assessments for different encoders and decoders (codecs), and video content. Then, we modeled the subjective video quality assessment characteristics based on objective features affected by coding and packet loss. Finally, we verified the model's validity by applying our model to unknown data sets different from training data sets used above.

  • Objective Evaluation of Components of Colour Distortions due to Image Compression

    Amal PUNCHIHEWA  Jonathan ARMSTRONG  Seiichiro HANGAI  Takayuki HAMAMOTO  

     
    PAPER-Evaluation

      Page(s):
    3307-3312

    This paper presents a novel approach of analysing colour bleeding caused by image compression. This is achieved by isolating two components of colour bleeding, and evaluating these components separately. Although these specific components of colour bleeding have not been studied with great detail in the past, with the use of a synthetic test pattern -- similar to the colour bars used to test analogue television transmissions -- we have successfully isolated, and evaluated: "colour blur" and "colour ringing," as two separate components of colour bleeding artefact. We have also developed metrics for these artefacts, and tested these derived metrics in a series of trials aimed to test the colour reproduction performance of a JPEG codec, and a JPEG2000 codec -- both implemented by the developer IrfanView. The algorithms developed to measure these artefact metrics proved to be effective tools for evaluating and benchmarking the performance of similar codecs, or different implementations of the same codecs.

  • Detection and Classification of Invariant Blurs

    Rachel Mabanag CHONG  Toshihisa TANAKA  

     
    PAPER-Imaging

      Page(s):
    3313-3320

    A new algorithm for simultaneously detecting and identifying invariant blurs is proposed. It is mainly based on the behavior of extrema values in an image. It is computationally simple and fast, making it suitable for preprocessing, especially in practical imaging applications. Benefits of employing this method include the elimination of unnecessary processes, since unblurred images are separated from the blurred ones that require deconvolution. Additionally, it can improve reconstruction performance through proper identification of the blur type, so that a more effective blur-specific deconvolution algorithm can be applied. Experimental results on natural images and their synthetically blurred versions show the characteristics and validity of the proposed method. Furthermore, it can be observed that feature selection makes the method more efficient and effective.

  • The Effects of Sensor Spectral Sensitivity, Pixel Pitch, Photon Shot Noise, and Dark Noise on Perceived Image Quality

    Hideyasu KUNIBA  Roy S. BERNS  

     
    PAPER-Imaging

      Page(s):
    3321-3327

    Image sensor noise was estimated in an approximately perceptually uniform space with a color image sensor model. In particular, the noise level with respect to an image sensor's pixel pitch and the dark noise was investigated. It was shown that the noise level could be about half when spectral sensitivity was optimized considering noise with reduced color reproduction accuracy. It was also shown that for a 2.0 µm pixel pitch sensor, the exposure index should be less than 100-150 in order to keep the noise level 94 less than 5 even if it had no dark noise, whereas the exposure index could reach about 2000-4000 for an 8.0 µm pixel pitch sensor depending on the sensor sensitivity and the dark noise level.

  • A Simple Method to Measure MTF of Paper and Its Application for Dot Gain Analysis

    Masayuki UKISHIMA  Hitomi KANEKO  Toshiya NAKAGUCHI  Norimichi TSUMURA  Markku HAUTA-KASARI  Jussi PARKKINEN  Yoichi MIYAKE  

     
    PAPER-Printing

      Page(s):
    3328-3335

    Image quality of halftone print is significantly influenced by optical characteristics of paper. Light scattering in paper produces optical dot gain, which has a significant influence on the tone and color reproductions of halftone print. The light scattering can be quantified by the Modulation Transfer Function (MTF) of paper. Several methods have been proposed to measure the MTF of paper. However, these methods have problems in efficiency or accuracy in the measurement. In this article, a new method is proposed to measure the MTF of paper efficiently and accurately, and the dot gain effect on halftone print is analyzed. The MTF is calculated from the ratio in spatial frequency domain between the responses of incident pencil light to paper and the perfect specular reflector. Since the spatial frequency characteristic of input pencil light can be obtained from the response of perfect specular reflector, it does not need to produce the input illuminant having "ideal" impulse characteristic. Our method is experimentally efficient since only two images need to be measured. Besides it can measure accurately since the data can be approximated by the conventional MTF model. Next, we predict the reflectance distribution of halftone print using the measured MTF in microscopy in order to analyze the dot gain effect since it can clearly be observed in halftone micro-structure. Finally, a simulation is carried out to remove the light scattering effect from the predicted image. Since the simulated image is not affected by the optical dot gain, it can be applied to analyze the real dot coverage.

  • Face Alignment Based on Statistical Models Using SIFT Descriptors

    Zisheng LI  Jun-ichi IMAI  Masahide KANEKO  

     
    PAPER-Processing

      Page(s):
    3336-3343

    Active Shape Model (ASM) is a powerful statistical tool for image interpretation, especially in face alignment. In the standard ASM, local appearances are described by intensity profiles, and the model parameter estimation is based on the assumption that the profiles follow a Gaussian distribution. It suffers from variations of poses, illumination, expressions and obstacles. In this paper, an improved ASM framework, GentleBoost based SIFT-ASM is proposed. Local appearances of landmarks are originally represented by SIFT (Scale-Invariant Feature Transform) descriptors, which are gradient orientation histograms based representations of image neighborhood. They can provide more robust and accurate guidance for search than grey-level profiles. Moreover, GentleBoost classifiers are applied to model and search the SIFT features instead of the unnecessary assumption of Gaussian distribution. Experimental results show that SIFT-ASM significantly outperforms the original ASM in aligning and localizing facial features.

  • Image Restoration Based on Adaptive Directional Regularization

    Osama AHMED OMER  Toshihisa TANAKA  

     
    PAPER-Processing

      Page(s):
    3344-3354

    This paper addresses problems appearing in restoration algorithms based on Tikhonov and bilateral total variation (BTV) regularization. The former regularization assumes that prior information has a Gaussian distribution, which fails at edges, while the latter regularization depends heavily on the selected bilateral filter's parameters. To overcome these problems, we propose a locally adaptive regularization. In the proposed algorithm, we use general directional regularization functions with adaptive weights. The adaptive weights are estimated from local patches based on the properties of the partially restored image. Unlike Tikhonov regularization, the proposed regularization avoids smoothing across edges by using adaptive weights. In addition, unlike BTV regularization, the proposed regularization function does not depend on parameter selection. The convexity conditions as well as the convergence conditions are derived for the proposed algorithm.

  • An Improved Method to CABAC in the H.264/AVC Video Compression Standard

    LeThanh HA  Chun-Su PARK  Seung-Won JUNG  Sung-Jea KO  

     
    PAPER-Coding

      Page(s):
    3355-3360

    Context-based Adaptive Binary Arithmetic Coding (CABAC) is adopted as an entropy coding tool for the main profile of the video coding standard H.264/AVC. CABAC achieves a higher degree of redundancy reduction by estimating the conditional probability of each binary symbol that is input to the arithmetic coder. This paper presents an entropy coding method based on CABAC. In the proposed method, the binary symbol is coded using a more precisely estimated conditional probability, thereby leading to performance improvement. We apply our method to the standard and evaluate its performance for different video sources and various quantization parameters (QP). Experimental results show that our method outperforms the original CABAC in terms of coding efficiency, and the average bit-rate savings are up to 1.2%.

  • Macroblock and Motion Feature Analysis to H.264/AVC Fast Inter Mode Decision

    Yiqing HUANG  Qin LIU  Shuijiong WU  Zhewen ZHENG  Takeshi IKENAGA  

     
    PAPER-Coding

      Page(s):
    3361-3368

    A fast inter mode decision algorithm is proposed in this paper. The whole algorithm is divided into two stages. In the pre-stage, by exploiting spatial and temporal information of encoded macroblocks (MBs), a skip mode early detection scheme is proposed. The homogeneity of the current MB is also analyzed to filter out small inter modes in this stage. Secondly, during the block matching stage, a motion feature based inter mode decision scheme is introduced by analyzing the motion vector predictor's accuracy, the block overlapping situation and the smoothness of the SAD (sum of absolute differences) value. Moreover, the rate distortion cost is checked at an early stage and we set some constraints to speed up the whole decision flow. Experiments show that our algorithm can achieve a speed-up factor of up to 53.4% for sequences with different motion types. The overall bit increment and quality degradation are negligible compared with existing works.

  • Improved Vector Quantization Based Block Truncation Coding Using Template Matching and Lloyd Quantization

    Seung-Won JUNG  Yeo-Jin YOON  Hyeong-Min NAM  Sung-Jea KO  

     
    LETTER-Coding

      Page(s):
    3369-3371

    Block truncation coding (BTC) is an efficient image compression algorithm that generates a constant output bit-rate. For color image compression, vector quantization (VQ) is exploited to improve the coding efficiency. In this letter, we propose an improved VQ based BTC (VQ-BTC) algorithm using template matching and Lloyd quantization (LQ). The experimental results show that the proposed method improves the PSNR by 0.9 dB on average compared to the conventional VQ-BTC algorithms.
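
    For readers unfamiliar with BTC, the sketch below shows the classic moment-preserving form on a single grayscale block: each pixel is reduced to one bit, and two reconstruction levels are chosen so that the block mean and variance are preserved. It illustrates plain BTC only, not the VQ-BTC, template matching, or Lloyd quantization steps of the letter; the function names and the sample block are hypothetical.

```python
import math


def btc_encode(block):
    """Encode a flat list of pixel values as (bitmap, low, high)."""
    m = len(block)
    mean = sum(block) / m
    std = math.sqrt(sum((p - mean) ** 2 for p in block) / m)
    bitmap = [1 if p > mean else 0 for p in block]
    q = sum(bitmap)  # number of pixels above the mean
    if q == 0:  # flat block: every pixel equals the mean
        return bitmap, mean, mean
    # Levels chosen so the decoded block keeps the same mean and variance.
    low = mean - std * math.sqrt(q / (m - q))
    high = mean + std * math.sqrt((m - q) / q)
    return bitmap, low, high


def btc_decode(bitmap, low, high):
    """Rebuild the block from its one-bit-per-pixel bitmap and two levels."""
    return [high if bit else low for bit in bitmap]


bitmap, low, high = btc_encode([2, 2, 10, 10])
print(bitmap, low, high)              # [0, 0, 1, 1] 2.0 10.0
print(btc_decode(bitmap, low, high))  # [2.0, 2.0, 10.0, 10.0]
```

    Because every block costs one bit per pixel plus two levels, the output bit-rate is constant regardless of image content, which matches the property stated in the abstract.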

  • Adaptive Ambient Illumination Based on Color Harmony Model

    Ayano KIKUCHI  Keita HIRAI  Toshiya NAKAGUCHI  Norimichi TSUMURA  Yoichi MIYAKE  

     
    LETTER-Color

      Page(s):
    3372-3375

    We investigated the relationship between ambient illumination and psychological effect by applying a modified color harmony model. We verified the proposed model by analyzing correlation between psychological value and modified color harmony score. Experimental results showed the possibility to obtain the best color for illumination using this model.

  • Regular Section
  • Boundary Implications for Stability Analysis of a Class of Uncertain Linear Time-Delay Systems by the Lambert W Function

    Hiroshi SHINOZAKI  Takehiro MORI  

     
    PAPER-Systems and Control

      Page(s):
    3376-3380

    The purpose of the paper is to show that boundary implication results hold for complex-valued uncertain linear time-delay systems. The results are derived using the Lambert W function and yield tractable robust stability criteria for simultaneously triangularizable linear time-delay systems. The setting is similar to a recently reported extreme-point result, but the assumed uncertainty sets can be much freer in shape.
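
    As a reminder of how the Lambert W function enters this kind of analysis, the standard scalar case can be worked out as follows. The system and the symbols a, b, h are assumptions chosen for illustration and are not taken from the abstract; W_k denotes the k-th branch of the inverse of w e^w.

```latex
% Scalar delay system (illustrative assumption):
\dot{x}(t) = a\,x(t) + b\,x(t-h), \qquad h > 0 .

% Characteristic equation and its solution by the Lambert W function:
s - a - b\,e^{-sh} = 0
\;\Longleftrightarrow\;
(s-a)h\,e^{(s-a)h} = b h\,e^{-ah}
\;\Longleftrightarrow\;
s_k = a + \frac{1}{h}\,W_k\!\left(b h\,e^{-ah}\right), \quad k \in \mathbb{Z}.
```

    The system is asymptotically stable when every root s_k has a negative real part.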

  • Ultra Low Power Delay Element with Post-Chip Adjustable Ability

    Jung-Lin YANG  Chih-Wei CHAO  

     
    PAPER-VLSI Design Technology and CAD

      Page(s):
    3381-3389

    This paper proposes a low-power delay element with many other valuable characteristics for asynchronous circuits in the bundled-data implementation. Delay elements are frequently used to interact with the asynchronous environment and to indicate the current status of bundled-data asynchronous circuits. Thus, a notable portion of the total energy in this kind of design is consumed by the delay elements. Moreover, constructing a specific delay on a chip is a difficult task in recent CMOS technology. An extremely low-power asymmetrical delay element with a post-chip adjustment feature was developed mainly to solve these issues. Our initial intention was to develop a programmable delay element for asynchronous data path components. The proposed delay element is also suitable for many other applications with low-power constraints. In addition to programmability, the delay element also demonstrated desirable characteristics, such as good tolerance of the delay to process and temperature variations. The power consumption of our delay element is approximately equivalent to the average power of a 4-stage inverter chain. A large delay can be obtained by a cascaded scheme with nearly zero handshaking overhead. All claims were carefully verified by post-layout simulation using TSMC 0.35 µm and 0.18 µm technologies under all extreme corners.

  • Deadbeat Control for Linear Systems with Input Constraints

    Dane BAANG  Dongkyoung CHWA  

     
    LETTER-Systems and Control

      Page(s):
    3390-3393

    A new deadbeat control scheme for linear systems with input constraints is presented. Input constraints exist in most control systems, but a systematic strategy for handling them in conventional deadbeat control has not been studied sufficiently. The proposed controller adjusts the number of steps for deadbeat tracking on-line, in order to achieve delayed deadbeat tracking performance while satisfying any admissible input constraint. Increasing the number of steps for deadbeat tracking makes delayed deadbeat tracking possible, and formulating the corresponding degrees of freedom as null-space vectors makes it possible to minimize the inevitable delay. LMI feasibility problems are solved to obtain the solution numerically and to minimize the unavoidable step delay. As a result, the computational effort is reduced compared with an LMI optimization problem. The proposed scheme can be readily implemented numerically. Its practical usefulness is validated by simulation of a 6-axis robot model and by experimental results for DC-motor servoing.

  • Performance Analysis of Complex CDMA Using Complex Chaotic Spreading Sequence with Constant Power

    Ryo TAKAHASHI  Ken UMENO  

     
    LETTER-Nonlinear Problems

      Page(s):
    3394-3397

    The performance of complex chaotic spreading sequences with constant power is investigated in chip-synchronous complex CDMA with complex scrambling. We estimate the signal-to-interference ratio (SIR) and the bit error rate (BER). Because an exact invariant measure of the complex chaotic spreading sequence can be obtained, the SIR can be calculated analytically. The result can be used as one of the criteria for evaluating the performance of complex CDMA using chaotic spreading sequences.

  • A Simple Canonical Code for Fullerene Graphs

    Naoki SHIMOTSUMA  Shin-ichi NAKANO  

     
    LETTER-Algorithms and Data Structures

      Page(s):
    3398-3400

    In this paper we give a simple algorithm to compute a canonical code for fullerene graphs. Our algorithm runs in O(n) time, while the best known algorithm runs in O(n^2) time. Our algorithm is simple. One can generalize the algorithm to compute a canonical code for the skeleton of a convex polyhedron with n vertices. The algorithm runs in O(n^2) time.

  • Hash Functions and Information Theoretic Security

    Nasour BAGHERI  Lars R. KNUDSEN  Majid NADERI  Søren S. THOMSEN  

     
    LETTER-Cryptography and Information Security

      Page(s):
    3401-3403

    Information theoretic security is an important security notion in cryptography as it provides a true lower bound for attack complexities. However, in practice attacks often have a higher cost than the information theoretic bound. In this paper we study the relationship between information theoretic attack costs and real costs. We show that in the information theoretic model, many well-known and commonly used hash functions such as MD5 and SHA-256 fail to be preimage resistant.

  • Constructions of Factorizable Multilevel Hadamard Matrices

    Shinya MATSUFUJI  Pingzhi FAN  

     
    LETTER-Spread Spectrum Technologies and Applications

      Page(s):
    3404-3406

    Factorization of Hadamard matrices can provide fast algorithms and facilitate efficient hardware realization. In this letter, constructions of factorizable multilevel Hadamard matrices, which can be considered a special case of unitary matrices, are investigated. In particular, a class of ternary Hadamard matrices, together with its application, is presented.
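
    To illustrate why factorization yields a fast algorithm, the sketch below uses the ordinary binary Sylvester Hadamard matrix, not the multilevel matrices constructed in this letter. The matrix of order n = 2^m factors into m sparse butterfly stages, so the transform needs about n log2 n additions instead of n^2. The names `fwht` and `hadamard` are assumptions made for illustration.

```python
def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform in Sylvester ordering.

    Each pass of the while loop applies one sparse butterfly factor of the
    Hadamard matrix, so the total cost is n * log2(n) additions.
    """
    a = list(x)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                u, v = a[i], a[i + h]
                a[i], a[i + h] = u + v, u - v
        h *= 2
    return a


def hadamard(n):
    """Dense Sylvester Hadamard matrix of order n, used only as a cross-check."""
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H
```

    Multiplying a vector by `hadamard(n)` directly gives the same result as `fwht`, which makes the dense matrix a convenient reference for testing the fast version.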

  • Fast Mode Decision Using Global Disparity Vector for Multiview Video Coding

    Dong-Hoon HAN  Yung-Ki LEE  Yung-Lyul LEE  

     
    LETTER-Image

      Page(s):
    3407-3411

    Since multiview video coding (MVC) based on H.264/AVC uses a prediction scheme that exploits inter-view correlation among multiview video, an MVC encoder compresses multiple views more efficiently than a simulcast H.264/AVC encoder. However, as the number of views to be encoded increases in MVC, the total encoding time increases greatly. To reduce the computational complexity of MVC, a fast mode decision using both macroblock-based region segmentation information and the global disparity vector among views is proposed. The proposed method achieves on average a 1.5 to 2.9 reduction of the total encoding time with a PSNR (Peak Signal-to-Noise Ratio) degradation of about 0.05 dB.