
Author Search Result

[Author] Masanori HASHIMOTO(67hit)

21-40hit(67hit)

  • Measurement Circuits for Acquiring SET Pulse Width Distribution with Sub-FO1-Inverter-Delay Resolution

    Ryo HARADA  Yukio MITSUYAMA  Masanori HASHIMOTO  Takao ONOYE  

     
    PAPER-Device and Circuit Modeling and Analysis

      Vol:
    E93-A No:12
      Page(s):
    2417-2423

    This paper presents two circuits to measure pulse width distribution of single event transients (SETs). We first review requirements for SET measurement in accelerated neutron radiation test and point out problems of previous works, in terms of time resolution, time/area efficiency for obtaining large samples and certainty in absolute values of pulse width. We then devise two measurement circuits and a pulse generator circuit that satisfy all the requirements and attain sub-FO1-inverter-delay resolution, and propose a measurement procedure for assuring the absolute width values. Operation of one of the proposed circuits was confirmed by a radiation experiment of alpha particles with a fabricated test chip.

  • Vulnerability Estimation of DNN Model Parameters with Few Fault Injections

    Yangchao ZHANG  Hiroaki ITSUJI  Takumi UEZONO  Tadanobu TOBA  Masanori HASHIMOTO  

     
    PAPER

      Publicized:
    2022/11/09
      Vol:
    E106-A No:3
      Page(s):
    523-531

    The reliability of deep neural networks (DNN) against hardware errors is essential as DNNs are increasingly employed in safety-critical applications such as automatic driving. Transient errors in memory, such as radiation-induced soft error, may propagate through the inference computation, resulting in unexpected output, which can adversely trigger catastrophic system failures. As a first step to tackle this problem, this paper proposes constructing a vulnerability model (VM) with a small number of fault injections to identify vulnerable model parameters in DNN. We reduce the number of bit locations for fault injection significantly and develop a flow to incrementally collect the training data, i.e., the fault injection results, for VM accuracy improvement. We enumerate key features (KF) that characterize the vulnerability of the parameters and use KF and the collected training data to construct VM. Experimental results show that VM can estimate vulnerabilities of all DNN model parameters only with 1/3490 computations compared with traditional fault injection-based vulnerability estimation.
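
    As a rough illustration of the kind of fault injection the abstract refers to, the sketch below inverts a single bit in the IEEE-754 float32 encoding of one parameter. The function name `flip_bit` and the example weight are hypothetical; this is not the paper's vulnerability model, only a minimal view of why some bit locations matter far more than others.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value` with one bit of its float32 encoding inverted.

    Bit 0 is the least significant mantissa bit, bit 31 is the sign bit.
    """
    if not 0 <= bit < 32:
        raise ValueError("bit must be in 0..31")
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    # Invert the chosen bit and reinterpret the result as float32 again.
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped


if __name__ == "__main__":
    weight = 0.5  # hypothetical model parameter
    for bit in (0, 23, 30, 31):
        print(f"bit {bit:2d}: {weight} -> {flip_bit(weight, bit)!r}")
```

    A flip in a low mantissa bit barely changes the value, while a flip in the top exponent bit changes it by many orders of magnitude, which is one reason a reduced set of bit locations can be sufficient for injection.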

  • An Approach for Reducing Leakage Current Variation due to Manufacturing Variability

    Tsuyoshi SAKATA  Takaaki OKUMURA  Atsushi KUROKAWA  Hidenari NAKASHIMA  Hiroo MASUDA  Takashi SATO  Masanori HASHIMOTO  Koutaro HACHIYA  Katsuhiro FURUKAWA  Masakazu TANAKA  Hiroshi TAKAFUJI  Toshiki KANAMOTO  

     
    PAPER-Device and Circuit Modeling and Analysis

      Vol:
    E92-A No:12
      Page(s):
    3016-3023

    Leakage current is an important qualitative metric of LSI (Large Scale Integrated circuit). In this paper, we focus on reducing leakage current variation under process variation. First, we derive a set of quadratic equations to evaluate delay and leakage current under process variation. Using these equations, we discuss the cases of varying leakage current without degrading the delay distribution and propose a procedure to reduce the leakage current variation. Experiments show that the proposed method reduces the leakage current variation by up to 50% at the 90th-percentile point of the distribution compared with the conventional design approach.

  • Experimental Study on Cell-Base High-Performance Datapath Design

    Masanori HASHIMOTO  Yoshiteru HAYASHI  Hidetoshi ONODERA  

     
    LETTER-IP Design

      Vol:
    E86-A No:12
      Page(s):
    3204-3207

    This paper experimentally investigates the effectiveness of regularly-placed bit-slice layout and transistor-level optimization for datapath circuit performance. We focus on cell-base design flows with transistor-level circuit optimization. We examine the effectiveness through design experiments on a 32-bit carry select adder and a 16-bit tree-style multiplier in a 0.35 µm technology. The experimental results show little evidence that manual cell placement improves circuit performance. On the other hand, transistor-level circuit optimization is so effective that circuit delay is reduced by 11-20% and power dissipation decreases to 42-62%. We can see that, in cell-base design, transistor-level optimization is as important as in custom design, whereas cell-base bit-slice layout matters less to circuit performance.

  • Field Slack Assessment for Predictive Fault Avoidance on Coarse-Grained Reconfigurable Devices

    Toshihiro KAMEDA  Hiroaki KONOURA  Dawood ALNAJJAR  Yukio MITSUYAMA  Masanori HASHIMOTO  Takao ONOYE  

     
    PAPER-Test and Verification

      Vol:
    E96-D No:8
      Page(s):
    1624-1631

    This paper proposes a procedure for avoiding delay faults in the field with slack assessment during standby time. The proposed procedure performs path delay testing and checks whether the slack is larger than a threshold value using a selectable delay embedded in basic elements (BEs). If the slack is smaller than the threshold, a pair of BEs to be replaced, which maximizes the path slack, is identified. Experimental results with two application circuits mapped on a coarse-grained architecture show that, for aging-induced delay degradation, a small threshold slack, which is less than 1 ps in a test case, is enough to ensure delay fault prediction.

  • A Sampling Switch Design Procedure for Active Matrix Liquid Crystal Displays

    Shingo TAKAHASHI  Shuji TSUKIYAMA  Masanori HASHIMOTO  Isao SHIRAKAWA  

     
    PAPER-Circuit Synthesis

      Vol:
    E89-A No:12
      Page(s):
    3538-3545

    In the design of an active matrix LCD (Liquid Crystal Display), the ratio of the pixel voltage to the video voltage (RPV) of a pixel is an important factor in the performance of the LCD, since the pixel voltage of each pixel determines its transmitted luminance. Thus, of practical importance is the issue of how to maintain the admissible allowance of RPV of each pixel within a prescribed narrow range. This constraint on RPV is analyzed in terms of circuit parameters associated with the sampling switch and sampling pulse of a column driver in the LCD. With the use of a minimal set of such circuit parameters, a design procedure dedicated to the sampling switch is described, which seeks an optimal sampling switch as well as an optimal sampling pulse waveform. A number of experimental results show that an optimal sampling switch attained by the proposed procedure yields a source driver with almost 18% less power consumption than the one obtained by manual design. Moreover, the percentage of the RPVs within 100±1% among 270 cases of fluctuations is 88.1% for the optimal sampling switch, but 46.7% for the manual design.

  • Transistor Sizing of LCD Driver Circuit for Technology Migration

    Masanori HASHIMOTO  Takahito IJICHI  Shingo TAKAHASHI  Shuji TSUKIYAMA  Isao SHIRAKAWA  

     
    LETTER-Circuit Synthesis

      Vol:
    E90-A No:12
      Page(s):
    2712-2717

    Design automation of LCD driver circuits is not yet well established. Display fineness of an LCD panel depends on a performance metric, the ratio of pixel voltage to video voltage (RPV). However, there are several other important metrics, such as area, and the best circuit cannot be decided uniquely. This paper proposes a design automation technique for an LCD column driver that provides several circuit design results with different performance so that designers can select an appropriate design among them. The proposed technique is evaluated with actual design data, and experimental results show that the proposed method successfully performs technology migration by transistor sizing. The proposed technique is also experimentally verified in terms of solution quality and computational time.

  • On-Chip Thermal Gradient Analysis and Temperature Flattening for SoC Design

    Takashi SATO  Junji ICHIMIYA  Nobuto ONO  Koutaro HACHIYA  Masanori HASHIMOTO  

     
    PAPER-Prediction and Analysis

      Vol:
    E88-A No:12
      Page(s):
    3382-3389

    This paper quantitatively analyzes the thermal gradient of an SoC and proposes a thermal flattening procedure. First, the impact of dominant parameters, such as area occupancy of memory/logic blocks, power density, and floorplan, on the thermal gradient is studied quantitatively. Temperature difference is also evaluated from timing and reliability standpoints. Important results obtained here are 1) the maximum temperature difference increases with higher memory area occupancy and 2) the difference is very floorplan sensitive. Then, we propose a procedure to amend the thermal gradient. A slight floorplan modification using the proposed procedure improves the on-chip thermal gradient significantly.

  • Accuracy Enhancement of Grid-Based SSTA by Coefficient Interpolation

    Shinyu NINOMIYA  Masanori HASHIMOTO  

     
    PAPER-Device and Circuit Modeling and Analysis

      Vol:
    E93-A No:12
      Page(s):
    2441-2446

    Statistical timing analysis for manufacturing variability requires modeling of spatially-correlated variation. Common grid-based modeling for spatially-correlated variability involves a trade-off between accuracy and computational cost, especially for PCA (principal component analysis). This paper proposes to spatially interpolate variation coefficients to improve accuracy instead of refining the spatial grids. Experimental results show that the spatial interpolation realizes a continuous expression of spatial correlation and reduces the maximum error of timing estimates that originates from sparse spatial grids. For attaining the same accuracy, the proposed interpolation reduced CPU time for PCA by 97.7% in a test case.

  • A Hardware Efficient Reservoir Computing System Using Cellular Automata and Ensemble Bloom Filter

    Dehua LIANG  Jun SHIOMI  Noriyuki MIURA  Masanori HASHIMOTO  Hiromitsu AWANO  

     
    PAPER-Computer System

      Publicized:
    2022/04/08
      Vol:
    E105-D No:7
      Page(s):
    1273-1282

    Reservoir computing (RC) is an attractive alternative to machine learning models owing to its computationally inexpensive training process and simplicity. In this work, we propose EnsembleBloomCA, which utilizes cellular automata (CA) and an ensemble Bloom filter to organize an RC system. In contrast to most existing RC systems, EnsembleBloomCA eliminates all floating-point calculation and integer multiplication. EnsembleBloomCA adopts CA as the reservoir in the RC system because it can be implemented using only binary operations and is thus energy efficient. The rich pattern dynamics created by CA can map the original input into a high-dimensional space and provide more features for the classifier. Utilizing an ensemble Bloom filter as the classifier, the features provided by the reservoir can be effectively memorized. Our experiment revealed that applying the ensemble mechanism to the Bloom filter resulted in a significant reduction in memory cost during the inference phase. In comparison with Bloom WiSARD, one of the state-of-the-art reference works, the EnsembleBloomCA model achieves a 43× reduction in memory cost while maintaining the same accuracy. Our hardware implementation also demonstrated that EnsembleBloomCA achieved over 23× and 8.5× reductions in area and power, respectively.
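
    To make the "only binary operations" point concrete, here is a minimal sketch of a cellular-automaton reservoir. It assumes elementary rule 90 (each cell becomes the XOR of its two neighbours) with periodic boundaries; the abstract does not say which rule EnsembleBloomCA uses, and the names `rule90_step` and `expand` are hypothetical.

```python
def rule90_step(cells):
    """Advance a 1-D binary cellular automaton by one step of rule 90.

    Each new cell is the XOR of its left and right neighbours
    (periodic boundary), so no multiplication or floating point is needed.
    """
    n = len(cells)
    return [cells[(i - 1) % n] ^ cells[(i + 1) % n] for i in range(n)]


def expand(cells, steps):
    """Concatenate the input state and `steps` evolved states into one
    longer binary feature vector, as a reservoir would."""
    states = [list(cells)]
    for _ in range(steps):
        states.append(rule90_step(states[-1]))
    return [bit for state in states for bit in state]


if __name__ == "__main__":
    seed = [0, 0, 1, 0, 0]
    print(rule90_step(seed))
    print(expand(seed, 3))
```

    The expanded vector is longer than the input, which is the sense in which the automaton maps the input into a higher-dimensional feature space for a downstream classifier.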

  • Jitter Amplifier for Oscillator-Based True Random Number Generator

    Takehiko AMAKI  Masanori HASHIMOTO  Takao ONOYE  

     
    PAPER-Cryptography and Information Security

      Vol:
    E96-A No:3
      Page(s):
    684-696

    We propose a jitter amplifier architecture for an oscillator-based true random number generator (TRNG). Two types of latency-controllable (LC) buffer, which are the key components of the proposed jitter amplifier, are presented. We derive an equation to estimate the gain of the jitter amplifier, and analyze sufficient conditions for the proposed circuit to work properly. The proposed jitter amplifier was fabricated with a 65 nm CMOS process. The jitter amplifier with the two-voltage LC buffer occupied 3,300 µm² and attained 8.4x gain, and that with the single-voltage LC buffer achieved 2.2x gain with a 1,700 µm² area. The jitter amplification of the sampling clock increased the entropy of a bit stream and improved the results of the NIST test suite so that all the tests passed whereas TRNGs with simple correctors failed. The jitter amplifier attained higher throughput per area than a frequency divider when the required amount of jitter was more than two times larger than the inherent jitter in our test-chip implementations.

  • Reliability-Configurable Mixed-Grained Reconfigurable Array Supporting C-Based Design and Its Irradiation Testing

    Hiroaki KONOURA  Dawood ALNAJJAR  Yukio MITSUYAMA  Hajime SHIMADA  Kazutoshi KOBAYASHI  Hiroyuki KANBARA  Hiroyuki OCHI  Takashi IMAGAWA  Kazutoshi WAKABAYASHI  Masanori HASHIMOTO  Takao ONOYE  Hidetoshi ONODERA  

     
    PAPER-High-Level Synthesis and System-Level Design

      Vol:
    E97-A No:12
      Page(s):
    2518-2529

    This paper proposes a mixed-grained reconfigurable architecture consisting of fine-grained and coarse-grained fabrics, each of which can be configured for different levels of reliability depending on the reliability requirement of target applications, e.g. mission-critical applications to consumer products. Thanks to the fine-grained fabrics, the architecture can accommodate a state machine, which is indispensable for exploiting C-based behavioral synthesis to trade latency with resource usage through multi-step processing using dynamic reconfiguration. In implementing the architecture, the strategy of dynamic reconfiguration, the assignment of configuration storage and the number of implementable states are key factors that determine the achievable trade-off between used silicon area and latency. We thus split the configuration bits into two classes, state-wise configuration bits and state-invariant configuration bits, to minimize the area overhead of configuration bit storage. Through a case study, we experimentally explore the appropriate number of implementable states. A proof-of-concept VLSI chip was fabricated in a 65 nm process. Measurement results show that applications on the chip can operate in a harsh radiation environment. Irradiation tests also show the correlation between the number of sensitive bits and the mean time to failure. Furthermore, the temporal error rate of an example application due to soft errors in the datapath was measured and demonstrated for reliability-aware mapping.

  • Representative Frequency for Interconnect R(f)L(f)C Extraction

    Akira TSUCHIYA  Masanori HASHIMOTO  Hidetoshi ONODERA  

     
    PAPER-Parasitics and Noise

      Vol:
    E86-A No:12
      Page(s):
    2942-2951

    This paper discusses the frequency at which to extract RLC values from interconnects. In circuit design, a frequency-independent equivalent circuit is widely used, and many design and analysis techniques based on this equivalent circuit have been proposed so far. In reality, however, the characteristics of interconnects are frequency-dependent. Pulse waveforms in digital circuits also contain multiple frequency components. The frequency used for RLC extraction affects the accuracy of interconnect characterization, and hence careful determination of the extraction frequency is critical. We propose a representative frequency for RLC extraction. The proposed method decides the representative frequency based on the interconnect length, whereas conventional representative frequencies are determined by input pulse shape, period and patterns. We verify that the extraction at the proposed frequency provides the most accurate transition waveform against various input signals and interconnect structures in digital circuits.

  • Second-Order Polynomial Expressions for On-Chip Interconnect Capacitance

    Atsushi KUROKAWA  Masanori HASHIMOTO  Akira KASEBE  Zhangcai HUANG  Yun YANG  Yasuaki INOUE  Ryosuke INAGAKI  Hiroo MASUDA  

     
    PAPER-Interconnect

      Vol:
    E88-A No:12
      Page(s):
    3453-3462

    Simple closed-form expressions for efficiently calculating on-chip interconnect capacitances are presented. The formulas are expressed with second-order polynomial functions which do not include exponential functions. The runtime of the proposed formulas is about 2-10 times faster than those of existing formulas. The root mean square (RMS) errors of the proposed formulas are within 1.5%, 1.3%, 3.1%, and 4.6% of the results obtained by a field solver for structures with one line above a ground plane, one line between ground planes, three lines above a ground plane, and three lines between ground planes, respectively. The proposed formulas are also superior in accuracy to existing formulas.

  • Activation-Aware Slack Assignment Based Mode-Wise Voltage Scaling for Energy Minimization

    TaiYu CHENG  Yutaka MASUDA  Jun NAGAYAMA  Yoichi MOMIYAMA  Jun CHEN  Masanori HASHIMOTO  

     
    PAPER

      Publicized:
    2021/08/31
      Vol:
    E105-A No:3
      Page(s):
    497-508

    Reducing power consumption is a crucial factor making industrial designs, such as mobile SoCs, competitive. Voltage scaling (VS) is the classical yet most effective technique that contributes to quadratic power reduction. A recent design technique called activation-aware slack assignment (ASA) enhances the voltage-scaling by allocating the timing margin of critical paths with a stochastic mean-time-to-failure (MTTF) analysis. Meanwhile, such stochastic treatment of timing errors is accepted in limited application domains, such as image processing. This paper proposes a design optimization methodology that achieves a mode-wise voltage-scalable (MWVS) design guaranteeing no timing errors in each mode operation. This work formulates the MWVS design as an optimization problem that minimizes the overall power consumption considering each mode duration, achievable voltage lowering and accompanied circuit overhead explicitly, and explores the solution space with the downhill simplex algorithm that does not require numerical derivation and frequent objective function evaluations. For obtaining a solution, i.e., a design, in the optimization process, we exploit the multi-corner multi-mode design flow in a commercial tool for performing mode-wise ASA with sets of false paths dedicated to individual modes. We applied the proposed design methodology to RISC-V design. Experimental results show that the proposed methodology saves 13% to 20% more power compared to the conventional VS approach and attains 8% to 15% gain from the conventional single-mode ASA. We also found that cycle-by-cycle fine-grained false path identification reduced leakage power by 31% to 42%.
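
    The "quadratic power reduction" mentioned in the abstract follows from the standard dynamic-power relation P = a * C * Vdd^2 * f. The sketch below evaluates that relation and a duty-weighted average over operating modes. It is a simplified illustration with hypothetical function names and mode values: it ignores leakage and any frequency change with voltage, and it is not the paper's optimization flow.

```python
def dynamic_power(activity, capacitance, vdd, frequency):
    """Standard CMOS dynamic power: activity * C * Vdd^2 * f."""
    return activity * capacitance * vdd ** 2 * frequency


def average_power(modes, activity=1.0, capacitance=1.0, frequency=1.0):
    """Duty-weighted average dynamic power over operating modes.

    `modes` is a list of (duty_fraction, vdd) pairs whose duty
    fractions are assumed to sum to 1.
    """
    return sum(
        duty * dynamic_power(activity, capacitance, vdd, frequency)
        for duty, vdd in modes
    )


if __name__ == "__main__":
    # Normalized example: lowering Vdd from 1.0 to 0.9 scales power by 0.81.
    print(dynamic_power(1.0, 1.0, 0.9, 1.0))
    # Hypothetical two-mode workload: 25% of time at 1.0 V, 75% at 0.8 V.
    print(average_power([(0.25, 1.0), (0.75, 0.8)]))
```

    Because each mode contributes in proportion to its duration, lowering the voltage of the longest-running mode gives the largest saving, which is why mode durations appear explicitly in the optimization problem described above.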

  • A Performance Optimization Method by Gate Resizing Based on Statistical Static Timing Analysis

    Masanori HASHIMOTO  Hidetoshi ONODERA  

     
    PAPER-Performance Optimization

      Vol:
    E83-A No:12
      Page(s):
    2558-2568

    This paper discusses a gate resizing method for performance enhancement based on statistical static timing analysis. The proposed method focuses on timing uncertainties caused by local random fluctuation. Our method aims to remove both over-design and under-design of a circuit, and to realize high-performance and high-reliability LSI design. The effectiveness of our method is examined on six benchmark circuits. We verify that our method can further reduce the delay time of circuits that were optimized for minimum delay without consideration of delay fluctuation.

  • Optimal Termination of On-Chip Transmission-Lines for High-Speed Signaling

    Akira TSUCHIYA  Masanori HASHIMOTO  Hidetoshi ONODERA  

     
    PAPER

      Vol:
    E90-C No:6
      Page(s):
    1267-1273

    This paper discusses the resistive termination of on-chip high-performance interconnects. Resistive termination is effective in improving the bandwidth of on-chip interconnects but, on the other hand, increases the power dissipation and the area. Therefore, a trade-off analysis of resistive termination is necessary. This paper proposes a method to determine the termination of on-chip interconnects. The termination derived by the proposed method provides minimum sensitivity to process variation as well as maximum eye-opening in voltage.

  • Signal-Dependent Analog-to-Digital Conversion Based on MINIMAX Sampling

    Igors HOMJAKOVS  Masanori HASHIMOTO  Tetsuya HIROSE  Takao ONOYE  

     
    PAPER

      Vol:
    E96-A No:2
      Page(s):
    459-468

    This paper presents an architecture of signal-dependent analog-to-digital converter (ADC) based on MINIMAX sampling scheme that allows achieving high data compression rate and power reduction. The proposed architecture consists of a conventional synchronous ADC, a timer and a peak detector. AD conversion is carried out only when input signal peaks are detected. To improve the accuracy of signal reconstruction, MINIMAX sampling is improved so that multiple points are captured for each peak, and its effectiveness is experimentally confirmed. In addition, power reduction, which is the primary advantage of the proposed signal-dependent ADC, is analytically discussed and then validated with circuit simulations.

  • Statistical Timing Analysis Considering Clock Jitter and Skew due to Power Supply Noise and Process Variation

    Takashi ENAMI  Shinyu NINOMIYA  Ken-ichi SHINKAI  Shinya ABE  Masanori HASHIMOTO  

     
    PAPER-Device and Circuit Modeling and Analysis

      Vol:
    E93-A No:12
      Page(s):
    2399-2408

    Clock drivers, like combinational cells, suffer from delay variation due to manufacturing and environmental variabilities. The delay variation causes clock skew and jitter, and varies both setup and hold timing margins. This paper presents a timing verification method that takes into consideration delay variation inside a clock network due to both manufacturing variability and dynamic power supply noise. We also discuss that setup and hold slack computation inherently involves a structural correlation problem due to common paths, and demonstrate that assigning individual random variables to upstream clock drivers provides a notable accuracy improvement in clock skew estimation with a limited increase in computational cost. We applied the proposed method to industrial designs in a 90 nm process. Experimental results show that dynamic delay variation reduces setup slack by over 500 ps and hold slack by 16.4 ps in test cases.

  • Power Gating Implementation for Supply Noise Mitigation with Body-Tied Triple-Well Structure

    Yasumichi TAKAI  Masanori HASHIMOTO  Takao ONOYE  

     
    PAPER-Circuit Design

      Vol:
    E95-A No:12
      Page(s):
    2220-2225

    This paper investigates power gating implementations that mitigate power supply noise. We focus on the body connection of power-gated circuits, and examine the amount of power supply noise induced by power-on rush current and the contribution of a power-gated circuit as a decoupling capacitance during the sleep mode. To figure out the best implementation, we designed and fabricated a test chip in 65 nm process. Experimental results with measurement and simulation reveal that the power-gated circuit with body-tied structure in triple-well is the best implementation from the following three points; power supply noise due to rush current, the contribution of decoupling capacitance during the sleep mode and the leakage reduction thanks to power gating.
