Atsushi KUROKAWA Takashi SATO Hiroo MASUDA
We present a new and efficient approach for extracting on-chip mutual inductances of VLSI interconnects by applying approximation formulae. The equations are based on the assumption of filaments or bars of finite width and zero thickness and are derived through Taylor expansion of the exact formula for mutual inductance between filaments. Despite the assumption of uniform current density in each of the bars, the model is sufficiently accurate for the interconnections of current and future LSIs because the skin and proximity effects do not affect most wires. Expression of the equations in polynomial form provides a balance between accuracy and computational complexity. These equations are mapped according to the geometric structures for which they are most suitable, minimizing the runtime of inductance calculation while retaining the required accuracy. Within geometrical constraints, the wires are of arbitrary specification. Results of a comprehensive evaluation based on the ITRS-specified global wiring structure for 2003 show that the inductance values extracted by using the proposed approach were within several percent of the values obtained by using commercial three-dimensional (3-D) field solvers. The efficiency of the proposed approach is also demonstrated by extraction from a real layout design that has 300-k interconnecting segments.
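As a hedged illustration of the kind of approximation described above (not the paper's own formulae), the following sketch compares the well-known closed-form mutual inductance of two parallel, aligned filaments of equal length with its leading-order expansion for a spacing much smaller than the length. The function names and the example dimensions are assumptions chosen for illustration.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability [H/m]

def mutual_exact(length, dist):
    """Closed-form mutual inductance [H] of two parallel, aligned filaments."""
    r = length / dist
    return MU0 * length / (2 * math.pi) * (
        math.log(r + math.sqrt(1 + r * r)) - math.sqrt(1 + 1 / (r * r)) + 1 / r
    )

def mutual_approx(length, dist):
    """Leading-order expansion of the closed form, valid for dist << length."""
    return MU0 * length / (2 * math.pi) * (
        math.log(2 * length / dist) - 1 + dist / length
    )

# Hypothetical geometry: 1 mm long filaments, 10 um apart.
length, dist = 1e-3, 10e-6
m_exact = mutual_exact(length, dist)
m_approx = mutual_approx(length, dist)
rel_err = abs(m_approx - m_exact) / m_exact
print(f"exact  = {m_exact * 1e9:.4f} nH")
print(f"approx = {m_approx * 1e9:.4f} nH")
print(f"relative error = {rel_err:.2e}")
```

The approximate form needs only one logarithm and a few arithmetic operations, which hints at why truncated expansions can be attractive when many segment pairs must be evaluated.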
Michihiro SHINTANI Takashi SATO
We propose a novel IDDQ outlier screening flow through a two-phase approach: a clustering-based filtering and an estimation-based current-threshold determination. In the proposed flow, a clustering technique first filters out chips that have high IDDQ current. Then, in the current-threshold determination phase, device parameters of the unfiltered chips are estimated based on measured IDDQ currents through Bayesian inference. The estimated device parameters are further used to determine a statistical leakage current distribution for each test pattern and to calculate a suitable current threshold. Numerical experiments using a virtual wafer show that our proposed technique is 14 times more accurate than the nearest neighbor residual (NNR) method and can achieve 80% of the test escape in the case of small leakage faults whose ratios of leakage fault sizes to the nominal IDDQ current are above 40%.
We define the communication complexity of a perfect zero-knowledge interactive proof (ZKIP) as the expected number of bits communicated to achieve the given error probabilities (of both completeness and soundness). While the round complexity of ZKIPs has been studied extensively, no progress has been made on their communication complexity. This paper shows a perfect ZKIP whose communication complexity is 11/12 of that of the standard perfect ZKIP for a specific class of Quadratic Residuosity.
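To make the notion of counting communicated bits concrete, here is a minimal sketch of one round of the textbook interactive proof for quadratic residuosity, together with a simple bit count. It is not the improved protocol of the paper. The toy modulus, the witness, and the accounting rule (two residues plus one challenge bit per round) are assumptions made for illustration.

```python
import math
import random

def qr_round(n, x, w, rng):
    """One round: the prover knows w with w*w % n == x. Returns (accepted, bits)."""
    # Prover picks a random unit r and sends the commitment y = r^2 mod n.
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    y = r * r % n
    # Verifier sends one random challenge bit.
    b = rng.randrange(2)
    # Prover answers with z = r * w^b mod n.
    z = r * pow(w, b, n) % n
    # Verifier checks z^2 == y * x^b (mod n).
    accepted = (z * z) % n == (y * pow(x, b, n)) % n
    # Assumed accounting: commitment and response cost |n| bits each, challenge 1 bit.
    bits = 2 * n.bit_length() + 1
    return accepted, bits

n = 7 * 11          # toy modulus, far too small for real use
w = 9               # prover's secret square root
x = w * w % n       # public quadratic residue
rounds = 20         # a cheating prover survives each round with probability 1/2
rng = random.Random(0)
results = [qr_round(n, x, w, rng) for _ in range(rounds)]
all_accepted = all(ok for ok, _ in results)
total_bits = sum(bits for _, bits in results)
print(all_accepted, total_bits)  # True 300
```

Under this accounting, reaching soundness error 2^-k costs k(2|n| + 1) bits, which is the kind of baseline that a ratio such as 11/12 would be measured against.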
Shumpei MORITA Song BIAN Michihiro SHINTANI Masayuki HIROMOTO Takashi SATO
Replacement of highly stressed logic gates with internal node control (INC) logics is known to be an effective way to alleviate timing degradation due to NBTI. We propose a path clustering approach to accelerate finding effective replacement gates. Based on the observation that there exist paths that always become timing critical after aging, critical path candidates are clustered to select a representative path in each cluster. With an efficient data structure to further reduce timing calculation, INC logic optimization has become tractable in practical time for the first time. Through experiments using a processor, a 171x speedup has been demonstrated while retaining almost the same level of mitigation gain.
Takashi IMAGAWA Masayuki HIROMOTO Hiroyuki OCHI Takashi SATO
This paper proposes a reliability evaluation environment for coarse-grained reconfigurable architectures. This environment is designed so that it can be easily extended to different target architectures and applications by automating the generation of the simulation inputs such as HDL codes for fault injection and configuration information. This automation enables us to explore a huge design space in order to efficiently analyze area/reliability trade-offs and find the best solution. This paper also shows demonstrative examples of the design space exploration of coarse-grained reconfigurable architectures using the proposed environment. Through the demonstrations, we discuss the relationship between coarse-grained architectures and reliability, which has not yet been addressed in the existing literature, and show the feasibility of the proposed environment.
Toshiki KANAMOTO Takaaki OKUMURA Katsuhiro FURUKAWA Hiroshi TAKAFUJI Atsushi KUROKAWA Koutaro HACHIYA Tsuyoshi SAKATA Masakazu TANAKA Hidenari NAKASHIMA Hiroo MASUDA Takashi SATO Masanori HASHIMOTO
This paper evaluates the impact of self-heating in wire interconnection on signal propagation delay in an upcoming 32 nm process technology, using practical physical parameters. This paper examines a 64-bit data transmission model as one of the cases with the most heating. Experimental results show that the maximum wire temperature increase due to self-heating appears in the case where the ratio of interconnect delay to driver delay becomes largest. However, even in the most significant case, which induces the maximum temperature rise of 11.0, the corresponding increase in the wire resistance is 1.99% and the resulting delay increase is only 1.15% for the assumed 32 nm process. Part of the reduced impact of wire self-heating on timing comes from the size effect of nano-scale wires.
Takashi ENAMI Takashi SATO Masanori HASHIMOTO
We propose an optimization method for power distribution networks that explicitly deals with timing. We have found and focused on the facts that decoupling capacitance (decap) does not necessarily improve gate delay, depending on the switching timing within a cycle, and that power wire expansion may locally degrade the voltage. To address these issues, we devised an efficient sensitivity calculation of timing to decap size and power wire width for guiding optimization. The proposed method, which is based on statistical noise modeling and timing analysis, accelerates sensitivity calculation with an approximation and adjoint sensitivity analysis. Experimental results show that decap allocation based on the sensitivity analysis efficiently minimizes the worst-case circuit delay within a given decap budget. Compared to the maximum decap placement, the delay improvement due to decap increases by 3.13% even while the total amount of decaps is reduced to 40%. Wire sizing with the proposed method also reduces the wire resource required to attain the same circuit delay by 11.5%.
Shiho HAGIWARA Takanori DATE Kazuya MASU Takashi SATO
This paper proposes a novel and efficient method termed hypersphere sampling to estimate the yield of circuits with low failure probability and a large number of variable sources. Importance sampling using a mean-shift Gaussian mixture distribution as an alternative distribution is used for yield estimation. Further, the proposed method is used to determine the shift locations of the Gaussian distributions. This method involves the bisection of cones whose bases are part of the hyperspheres, in order to locate probabilistically important regions of failure; the determination of these regions accelerates the convergence of importance sampling. Clustering of the failure samples determines the required number of Gaussian distributions. Successful static random access memory (SRAM) yield estimations of 6- to 24-dimensional problems are presented. The number of Monte Carlo trials has been reduced by 2-5 orders of magnitude as compared to conventional Monte Carlo simulation methods.
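The importance-sampling step can be sketched as follows. This is a simplified illustration with a single mean-shifted Gaussian, not the hypersphere sampling procedure itself. The failure criterion, the hand-picked shift location, and the sample count are all assumptions; in the method described above, the shift locations are found automatically and several Gaussians may be mixed.

```python
import math
import random

def fails(x):
    """Hypothetical failure criterion: first variable exceeds 4 sigma."""
    return x[0] > 4.0

def importance_sampling_estimate(shift, n_samples, rng):
    """Estimate P(fail) under N(0, I) by sampling from N(shift, I)."""
    shift_sq = sum(s * s for s in shift)
    total = 0.0
    for _ in range(n_samples):
        x = [rng.gauss(s, 1.0) for s in shift]
        if fails(x):
            # Likelihood ratio p(x)/q(x) for unit-variance Gaussians.
            dot = sum(xi * si for xi, si in zip(x, shift))
            total += math.exp(-dot + 0.5 * shift_sq)
    return total / n_samples

rng = random.Random(42)
shift = [4.0, 0.0]  # assumed shift onto the failure boundary
p_is = importance_sampling_estimate(shift, 20000, rng)
p_true = 0.5 * math.erfc(4.0 / math.sqrt(2.0))  # analytic reference, about 3.17e-5
print(f"importance sampling: {p_is:.3e}, analytic: {p_true:.3e}")
```

Plain Monte Carlo with the same 20000 samples would be expected to see fewer than one failure on average, which is why placing the sampling distribution near the failure region matters so much for rare events.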
Hiromitsu AWANO Hiroshi TSUTSUI Hiroyuki OCHI Takashi SATO
Random telegraph noise (RTN) is a phenomenon that is considered to limit the reliability and performance of circuits using advanced devices. The time constants of carrier capture and emission and the associated change in the threshold voltage are important parameters commonly included in various models, but their extraction from time-domain observations has been a difficult task. In this study, we propose a statistical method for simultaneously estimating interrelated parameters: the time constants and magnitude of the threshold voltage shift. Our method is based on a graphical network representation, and the parameters are estimated using the Markov chain Monte Carlo method. Experimental application of the proposed method to synthetic and measured time-domain RTN signals was successful. The proposed method can handle interrelated parameters of multiple traps and thereby contributes to the construction of more accurate RTN models.
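For intuition about the parameters involved, the sketch below generates a synthetic single-trap RTN trace as a two-state Markov chain and recovers the two time constants from mean dwell times. This is a much simpler estimator than the Markov chain Monte Carlo approach described above and handles only one trap. The time constants, the sampling step, and the function names are assumptions.

```python
import random
from statistics import mean

def simulate_rtn(tau_c, tau_e, dt, n_steps, rng):
    """Return a 0/1 trace: 0 = trap empty, 1 = carrier captured."""
    state, trace = 0, []
    for _ in range(n_steps):
        # Switching probability per step depends on the current state.
        p_switch = dt / (tau_c if state == 0 else tau_e)
        if rng.random() < p_switch:
            state = 1 - state
        trace.append(state)
    return trace

def mean_dwell_times(trace, dt):
    """Mean time spent in state 0 and in state 1 between transitions."""
    dwell = {0: [], 1: []}
    run_state, run_len = trace[0], 1
    for s in trace[1:]:
        if s == run_state:
            run_len += 1
        else:
            dwell[run_state].append(run_len * dt)
            run_state, run_len = s, 1
    # The final, unfinished run is dropped because its true length is unknown.
    return mean(dwell[0]), mean(dwell[1])

rng = random.Random(7)
trace = simulate_rtn(tau_c=20.0, tau_e=50.0, dt=1.0, n_steps=200000, rng=rng)
est_tau_c, est_tau_e = mean_dwell_times(trace, dt=1.0)
print(f"estimated tau_c = {est_tau_c:.1f}, tau_e = {est_tau_e:.1f}")
```

With several traps superimposed and measurement noise added, dwell times can no longer be read directly off the trace, which is the situation where a joint statistical estimate becomes necessary.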
Hiroshi YUASA Hiroshi TSUTSUI Hiroyuki OCHI Takashi SATO
We propose a novel acceleration scheme for Monte Carlo based statistical static timing analysis (MC-SSTA). MC-SSTA, which repeatedly executes ordinary STA using a set of randomly generated gate delay samples, is widely accepted as an accuracy reference. A large number of random samples, however, must be processed to obtain accurate delay distributions, and a software implementation of MC-SSTA therefore takes an impractically long processing time. In our approach, a generalized hardware module, the STA processing element (STA-PE), is used for the delay evaluation of a logic gate, and netlist-specific information is delivered in the form of instructions from an SRAM. Multiple STA-PEs can be implemented for parallel processing, while a larger netlist can be handled provided only that a larger SRAM area is available. The proposed scheme is successfully implemented on Altera's Arria II GX EP2AGX125EF35C4 device, in which 26 STA-PEs and a 624-port Mersenne Twister-based random number generator run in parallel at a 116 MHz clock rate. A speedup of far more than 10x is achieved compared to conventional methods including GPU implementation.
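The software baseline that such a scheme accelerates can be sketched in a few lines: ordinary STA is repeated over randomly sampled gate delays and the output arrival times are collected. The five-gate netlist, the Gaussian delay model, and the trial count below are hypothetical and serve only to show the loop structure.

```python
import random
import statistics

# Hypothetical netlist in topological order: gate -> (fan-in gates, mean delay, sigma)
NETLIST = {
    "g1": ([], 1.0, 0.10),
    "g2": ([], 1.2, 0.10),
    "g3": (["g1", "g2"], 0.8, 0.08),
    "g4": (["g2"], 1.5, 0.15),
    "g5": (["g3", "g4"], 1.0, 0.10),
}

def sta_once(rng):
    """One STA pass with freshly sampled gate delays; returns arrival time at g5."""
    arrival = {}
    for gate, (fanin, mu, sigma) in NETLIST.items():
        delay = max(0.0, rng.gauss(mu, sigma))  # clamp to keep delays non-negative
        latest_input = max((arrival[f] for f in fanin), default=0.0)
        arrival[gate] = latest_input + delay
    return arrival["g5"]

rng = random.Random(1)
samples = sorted(sta_once(rng) for _ in range(5000))
mean_delay = statistics.mean(samples)
p99_delay = samples[int(0.99 * len(samples))]
print(f"mean = {mean_delay:.3f}, 99th percentile = {p99_delay:.3f}")
```

Every trial is independent and each gate evaluation is just a max followed by an add, which is what makes the workload a natural fit for many small processing elements running in parallel.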
Takashi IMAGAWA Masayuki HIROMOTO Hiroyuki OCHI Takashi SATO
Time redundancy is sometimes the only option for enhancing circuit reliability when the circuit area is severely restricted. In this paper, a time-redundant error-correction scheme, which is particularly suitable for coarse-grained reconfigurable arrays (CGRAs), is proposed. It judges the correctness of the executions by comparing the results of two identical runs. Once a mismatch is found, the second run is terminated immediately to start the third run, under the assumption that the errors tend to persist in many applications, for selecting the correct result among the three runs. The circuit area and reliability of the proposed method are compared with a straightforward implementation of time redundancy and with selective triple modular redundancy (TMR). A case study on a CGRA revealed that the area of the proposed method is 1% larger than that of the implementation for the selective TMR. The study also shows that the proposed scheme is up to 2.6x more reliable than full TMR when persistent errors are predominant.
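The control flow of such a compare-and-rerun scheme might look like the sketch below. The `execute(run_id, step)` interface, the step-by-step comparison, and the rule used to pick the final result are assumptions made for illustration; the abstract does not spell out these details.

```python
def run_with_time_redundancy(execute, n_steps):
    """Return (result, number_of_runs_started). `execute` is a hypothetical hook."""
    first = [execute(1, s) for s in range(n_steps)]
    second = []
    for s in range(n_steps):
        out = execute(2, s)
        second.append(out)
        if out != first[s]:
            break  # mismatch: abandon the second run immediately
    else:
        return first, 2  # both runs agree, no third run needed
    third = [execute(3, s) for s in range(n_steps)]
    if third == first:
        return first, 3  # the second run was the faulty one
    if third[:len(second)] == second:
        return third, 3  # the first run was the faulty one
    raise RuntimeError("no two runs agree")

def faulty_execute(run_id, step):
    """Toy workload: squares the step index, with one injected error in run 2."""
    if run_id == 2 and step == 3:
        return -1
    return step * step

def clean_execute(run_id, step):
    return step * step

result, runs = run_with_time_redundancy(faulty_execute, 5)
clean_result, clean_runs = run_with_time_redundancy(clean_execute, 5)
print(result, runs)              # [0, 1, 4, 9, 16] 3
print(clean_result, clean_runs)  # [0, 1, 4, 9, 16] 2
```

Stopping the second run at the first mismatch means that the common fault-free case costs two runs, and a faulty case costs less than three full runs.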