The search functionality is under construction.

Author Search Result

[Author] Youhua SHI(16hit)

1-16hit
  • An Effective Suspicious Timing-Error Prediction Circuit Insertion Algorithm Minimizing Area Overhead

    Shinnosuke YOSHIDA  Youhua SHI  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER

      Vol:
    E98-A No:7
      Page(s):
    1406-1418

    As process technologies advance, timing-error correction techniques have become important as well. A suspicious timing-error prediction (STEP) technique has been proposed recently, which predicts timing errors by monitoring the middle points, or check points of several speed-paths in a circuit. However, if we insert STEP circuits (STEPCs) in the middle points of all the paths from primary inputs to primary outputs, we need many STEPCs and thus require too much area overhead. How to determine these check points is very important. In this paper, we propose an effective STEPC insertion algorithm minimizing area overhead. Our proposed algorithm moves the STEPC insertion positions to minimize inserted STEPC counts. We apply a max-flow and min-cut approach to determine the optimal positions of inserted STEPCs and reduce the required number of STEPCs to 1/10-1/80 and their area to 1/5-1/8 compared with a naive algorithm. Furthermore, our algorithm realizes 1.12X-1.5X overclocking compared with just inserting STEPCs into several speed-paths.

  • Scan-Based Attack on AES through Round Registers and Its Countermeasure

    Youhua SHI  Nozomu TOGAWA  Masao YANAGISAWA  

     
    PAPER-High-Level Synthesis and System-Level Design

      Vol:
    E95-A No:12
      Page(s):
    2338-2346

    Scan-based side channel attack on hardware implementations of cryptographic algorithms has shown its great security threat. Unlike existing scan-based attacks, in our work we observed that instead of the secret-related-registers, some non-secret registers also carry the potential of being misused to help a hacker to retrieve secret keys. In this paper, we first present a scan-based side channel attack method on AES by making use of the round counter registers, which are not paid attention to in previous works, to show the potential security threat in designs with scan chains. And then we discussed the issues of secure DFT requirements and proposed a secure scan scheme to preserve all the advantages and simplicities of traditional scan test, while significantly improve the security with ignorable design overhead, for crypto hardware implementations.

  • Floorplan Driven Architecture and High-Level Synthesis Algorithm for Dynamic Multiple Supply Voltages

    Shin-ya ABE  Youhua SHI  Kimiyoshi USAMI  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER-High-Level Synthesis and System-Level Design

      Vol:
    E96-A No:12
      Page(s):
    2597-2611

    In this paper, we propose an adaptive voltage huddle-based distributed-register architecture (AVHDR architecture), which integrates dynamic multiple supply voltages and interconnection delay into high-level synthesis. In AVHDR architecture, voltages can be dynamically assigned for energy reduction. In other words, low supply voltages are assigned to non-critical operations, and leakage power is cut off by turning off the power supply to the sleeping functional units. Next, an AVHDR-based high-level synthesis algorithm is proposed. Our algorithm is based on iterative improvement of scheduling/binding and floorplanning. In the iteration process, the modules in each huddle can be placed close to each other and the corresponding AVHDR architecture can be generated and optimized with floorplanning information. Experimental results show that on average our algorithm achieves 43.9% energy-saving compared with conventional algorithms.

  • X-Handling for Current X-Tolerant Compactors with More Unknowns and Maximal Compaction

    Youhua SHI  Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-Logic Synthesis, Test and Verfication

      Vol:
    E92-A No:12
      Page(s):
    3119-3127

    This paper presents a novel X-handling technique, which removes the effect of unknowns on compacted test response with maximal compaction ratio. The proposed method combines with the current X-tolerant compactors and inserts masking cells on scan paths to selectively mask X's. By doing this, the number of unknown responses in each scan-out cycle could be reduced to a reasonable level such that the target X-tolerant compactor would tolerate with guaranteed possible error detection. It guarantees no test loss due to the effect of X's, and achieves the maximal compaction that the target response compactor could provide as well. Moreover, because the masking cells are only inserted on the scan paths, it has no performance degradation of the designs. Experimental results demonstrate the effectiveness of the proposed method.

  • A Selective Scan Chain Reconfiguration through Run-Length Coding for Test Data Compression and Scan Power Reduction

    Youhua SHI  Shinji KIMURA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-Test

      Vol:
    E87-A No:12
      Page(s):
    3208-3215

    Test data volume and power consumption for scan-based designs are two major concerns in system-on-a-chip testing. However, test set compaction by filling the don't-cares will invariably increase the scan-in power dissipation for scan testing, then the goals of test data reduction and low-power scan testing appear to be conflicted. Therefore, in this paper we present a selective scan chain reconfiguration method for test data compression and scan-in power reduction. The proposed method analyzes the compatibility of the internal scan cells for a given test set and then divides the scan cells into compatible classes. After the scan chain reconfiguration a dictionary is built to indicate the run-length of each compatible class and only the scan-in data for each class should be transferred from the ATE to the CUT so as to reduce test data volume. Experimental results for the larger ISCAS'89 benchmarks show that the proposed approach overcomes the limitations of traditional run-length coding techniques, and leads to highly reduced test data volume with significant power savings during scan testing in all cases.

  • A Hybrid Dictionary Test Data Compression for Multiscan-Based Designs

    Youhua SHI  Shinji KIMURA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-Test

      Vol:
    E87-A No:12
      Page(s):
    3193-3199

    In this paper, we present a test data compression technique to reduce test data volume for multiscan-based designs. In our method the internal scan chains are divided into equal sized groups and two dictionaries were build to encode either an entire slice or a subset of the slice. Depending on the codeword, the decompressor may load all scan chains or may load only a group of the scan chains, which can enhance the effectiveness of dictionary-based compression. In contrast to previous dictionary coding techniques, even for the CUT with a large number of scan chains, the proposed approach can achieve satisfied reduction in test data volume with a reasonable smaller dictionary. Experimental results showed the proposed test scheme works particularly well for the large ISCAS'89 benchmarks.

  • Unified Dual-Radix Architecture for Scalable Montgomery Multiplications in GF(P) and GF(2n)

    Kazuyuki TANIMURA  Ryuta NARA  Shunitsu KOHARA  Youhua SHI  Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E92-A No:9
      Page(s):
    2304-2317

    Modular multiplication is the most dominant arithmetic operation in elliptic curve cryptography (ECC), that is a type of public-key cryptography. Montgomery multiplier is commonly used to compute the modular multiplications and requires scalability because the bit length of operands varies depending on its security level. In addition, ECC is performed in GF(P) or GF(2n), and unified architecture for multipliers in GF(P) and GF(2n) is required. However, in previous works, changing frequency is necessary to deal with delay-time difference between GF(P) and GF(2n) multipliers because the critical path of the GF(P) multiplier is longer. This paper proposes unified dual-radix architecture for scalable Montgomery multiplications in GF(P) and GF(2n). This proposed architecture unifies four parallel radix-216 multipliers in GF(P) and a radix-264 multiplier in GF(2n) into a single unit. Applying lower radix to GF(P) multiplier shortens its critical path and makes it possible to compute the operands in the two fields using the same multiplier at the same frequency so that clock dividers to deal with the delay-time difference are not required. Moreover, parallel architecture in GF(P) reduces the clock cycles increased by dual-radix approach. Consequently, the proposed architecture achieves to compute a GF(P) 256-bit Montgomery multiplication in 0.28 µs. The implementation result shows that the area of the proposal is almost the same as that of previous works: 39 kgates.

  • A Built-in Reseeding Technique for LFSR-Based Test Pattern Generation

    Youhua SHI  Zhe ZHANG  Shinji KIMURA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-Timing Verification and Test Generation

      Vol:
    E86-A No:12
      Page(s):
    3056-3062

    Reseeding technique is proposed to improve the fault coverage in pseudo-random testing. However most of previous works on reseeding is based on storing the seeds in an external tester or in a ROM. In this paper we present a built-in reseeding technique for LFSR-based test pattern generation. The proposed structure can run both in pseudorandom mode and in reseeding mode. Besides, our method requires no storage for the seeds since in reseeding mode the seeds can be generated automatically in hardware. In this paper we also propose an efficient grouping algorithm based on simulated annealing to optimize test vector grouping. Experimental results for benchmark circuits indicate the superiority of our technique against other reseeding methods with respect to test length and area overhead. Moreover, since the theoretical properties of LFSRs are preserved, our method could be beneficially used in conjunction with any other techniques proposed so far.

  • A Unified Test Compression Technique for Scan Stimulus and Unknown Masking Data with No Test Loss

    Youhua SHI  Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-Logic Synthesis, Test and Verification

      Vol:
    E91-A No:12
      Page(s):
    3514-3523

    This paper presents a unified test compression technique for scan stimulus and unknown masking data with seamless integration of test generation, test compression and all unknown response masking for high quality manufacturing test cost reduction. Unlike prior test compression methods, the proposed approach considers the unknown responses during test pattern generation procedure, and then selectively encodes the less specified bits (either 1s or 0s) in each scan slice for compression while at the same time masks the unknown responses before sending them to the response compactor. The proposed test scheme could dramatically reduce test data volume as well as the number of required test channels by using only c tester channels to drive N internal scan chains, where c = 「 log 2N 」 + 2. In addition, because all the unknown responses could be exactly masked before entering into the response compactor, test loss due to unknown responses would be eliminated. Experimental results on both benchmark circuits and larger designs indicated the effectiveness of the proposed technique.

  • Faithfully Truncated Adder-Based Area-Power Efficient FIR Design with Predefined Output Accuracy

    Jinghao YE  Masao YANAGISAWA  Youhua SHI  

     
    PAPER

      Vol:
    E103-A No:9
      Page(s):
    1063-1070

    To solve the area and power problems in Finite Impulse Response (FIR) implementations, a faithfully truncated adder-based FIR design is presented in this paper for significant area and power savings while the predefined output accuracy can still be obtained. As a solution to the accuracy loss caused by truncated adders, a static error analysis on the utilization of truncated adders in FIRs was performed. According to the mathematical analysis, we show that, with a given accuracy constraint, the optimal truncated adder configuration for an area-power efficient FIR design can be effortlessly determined. Evaluation results on various FIR implementations by using the proposed faithfully truncated adder designs showed that up to 35.4% and 27.9% savings in area and power consumption can be achieved with less than 1 ulp accuracy loss for uniformly distributed random inputs. Moreover, as a case study for normally distributed signals, a fixed 6-tap FIR is implemented for electrocardiogram (ECG) signal filtering was implemented, in which even with the increased truncated bits up to 10, the mean absolute error (Ē) can be guaranteed to be less than 1 ulp while up to 29.7% and 25.3% savings in area and power can be obtained.

  • A Hardware-Trojans Identifying Method Based on Trojan Net Scoring at Gate-Level Netlists

    Masaru OYA  Youhua SHI  Noritaka YAMASHITA  Toshihiko OKAMURA  Yukiyasu TSUNOO  Satoshi GOTO  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER-Logic Synthesis, Test and Verification

      Vol:
    E98-A No:12
      Page(s):
    2537-2546

    Outsourcing IC design and fabrication is one of the effective solutions to reduce design cost but it may cause severe security risks. Particularly, malicious outside vendors may implement Hardware Trojans (HTs) on ICs. When we focus on IC design phase, we cannot assume an HT-free netlist or a Golden netlist and it is too difficult to identify whether a given netlist is HT-free or not. In this paper, we propose a score-based hardware-trojans identifying method at gate-level netlists without using a Golden netlist. Our proposed method does not directly detect HTs themselves in a gate-level netlist but it detects a net included in HTs, which is called Trojan net, instead. Firstly, we observe Trojan nets from several HT-inserted benchmarks and extract several their features. Secondly, we give scores to extracted Trojan net features and sum up them for each net in benchmarks. Then we can find out a score threshold to classify HT-free and HT-inserted netlists. Based on these scores, we can successfully classify HT-free and HT-inserted netlists in all the Trust-HUB gate-level benchmarks and ISCAS85 benchmarks as well as HT-free and HT-inserted AES gate-level netlists. Experimental results demonstrate that our method successfully identify all the HT-inserted gate-level benchmarks to be “HT-inserted” and all the HT-free gate-level benchmarks to be “HT-free” in approximately three hours for each benchmark.

  • An Energy-Efficient Floorplan Driven High-Level Synthesis Algorithm for Multiple Clock Domains Design

    Shin-ya ABE  Youhua SHI  Kimiyoshi USAMI  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER

      Vol:
    E98-A No:7
      Page(s):
    1376-1391

    In this paper, we first propose an HDR-mcd architecture, which integrates periodically all-in-phase based multiple clock domains and multi-cycle interconnect communication into high-level synthesis. In HDR-mcd, an entire chip is divided into several huddles. Huddles can realize synchronization between different clock domains in which interconnection delay should be considered during high-level synthesis. Next, we propose a high-level synthesis algorithm for HDR-mcd, which can reduce energy consumption by optimizing configuration and placement of huddles. Experimental results show that the proposed method achieves 32.5% energy-saving compared with the existing single clock domain based methods.

  • A Low Power Soft Error Hardened Latch with Schmitt-Trigger-Based C-Element

    Saki TAJIMA  Nozomu TOGAWA  Masao YANAGISAWA  Youhua SHI  

     
    PAPER

      Vol:
    E101-A No:7
      Page(s):
    1025-1034

    To deal with the reliability issue caused by soft errors, this paper proposed a low power soft error hardened latch (SHC) design using a novel Schmitt-Trigger-based C-element for reliable low power applications. Unlike state-of-the-art soft error tolerant latches that are usually based on hardware redundancy with large area overhead and high power consumption, the proposed SHC latch is implemented through double-sampling and node-checking using a novel Schmitt-Trigger-based C-element, which can help to reduce the area overhead and the corresponding power consumption as well. The evaluation results show that the total number of transistors of the proposed SHC latch is only increased by 2 when compared to the conventional unhardened C2MOS latch, while up to 20.35% and 82.96% power reduction can be achieved when compared to the conventional unhardened C2MOS latch and the existing soft error tolerant HiPeR design, respectively.

  • A Secure Test Technique for Pipelined Advanced Encryption Standard

    Youhua SHI  Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    LETTER

      Vol:
    E91-D No:3
      Page(s):
    776-780

    In this paper, we presented a Design-for-Secure-Test (DFST) technique for pipelined AES to guarantee both the security and the test quality during testing. Unlike previous works, the proposed method can keep all the secrets inside and provide high test quality and fault diagnosis ability as well. Furthermore, the proposed DFST technique can significantly reduce test application time, test data volume, and test generation effort as additional benefits.

  • Extension and Performance/Accuracy Formulation for Optimal GeAr-Based Approximate Adder Designs

    Ken HAYAMIZU  Nozomu TOGAWA  Masao YANAGISAWA  Youhua SHI  

     
    PAPER

      Vol:
    E101-A No:7
      Page(s):
    1014-1024

    Approximate computing is a promising solution for future energy-efficient designs because it can provide great improvements in performance, area and/or energy consumption over traditional exact-computing designs for non-critical error-tolerant applications. However, the most challenging issue in designing approximate circuits is how to guarantee the pre-specified computation accuracy while achieving energy reduction and performance improvement. To address this problem, this paper starts from the state-of-the-art general approximate adder model (GeAr) and extends it for more possible approximate design candidates by relaxing the design restrictions. And then a maximum-error-distance-based performance/accuracy formulation, which can be used to select the performance/energy-accuracy optimal design from the extended design space, is proposed. Our evaluation results show the effectiveness of the proposed method in terms of area overhead, performance, energy consumption, and computation accuracy.

  • Selective Low-Care Coding: A Means for Test Data Compression in Circuits with Multiple Scan Chains

    Youhua SHI  Nozomu TOGAWA  Shinji KIMURA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER

      Vol:
    E89-A No:4
      Page(s):
    996-1004

    This paper presents a test input data compression technique, Selective Low-Care Coding (SLC), which can be used to significantly reduce input test data volume as well as the external test channel requirement for multiscan-based designs. In the proposed SLC scheme, we explored the linear dependencies of the internal scan chains, and instead of encoding all the specified bits in test cubes, only a smaller amount of specified bits are selected for encoding, thus greater compression can be expected. Experiments on the larger benchmark circuits show drastic reduction in test data volume with corresponding savings on test application time can be indeed achieved even for the well-compacted test set.