The issue of copying values versus references has historically been studied for managing memory objects, especially in distributed systems. In this paper, we explore copying values vs. references in a new context: memory page compaction on virtualized systems. Memory page compaction moves target physical pages to a contiguous memory region at the operating-system kernel level to create huge pages. Memory virtualization provides an opportunity to perform memory page compaction by copying the references of the physical pages. That is, instead of copying pages' values, we can move guest physical pages by changing the mappings of guest-physical to machine-physical pages. The goal of this paper is a quantitative comparison between value- and reference-based memory page compaction. To do so, we developed a software mechanism that achieves memory page compaction by appropriately updating the references of guest-physical pages. We prototyped the mechanism on Linux 4.19.29, and the experimental results show that the prototype's page compaction is up to 78% faster and achieves up to 17% higher performance on memory-intensive real-world applications compared to the default value-copy compaction scheme.
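As a minimal illustration of the two schemes (a Python sketch, not the paper's Linux implementation; the flat bytearray memory model and the dict-based table name guest_to_machine are assumptions):

```python
# Minimal sketch contrasting value-copy and reference-update page compaction.

PAGE_SIZE = 4096

def compact_by_value(memory, scattered, target_base):
    """Value copy: physically move each page's bytes into a contiguous region."""
    for i, src in enumerate(scattered):
        dst = target_base + i * PAGE_SIZE
        memory[dst:dst + PAGE_SIZE] = memory[src:src + PAGE_SIZE]  # O(PAGE_SIZE) per page

def compact_by_reference(guest_to_machine, scattered_gfns, target_gfn_base):
    """Reference copy: repoint contiguous guest-physical frame numbers at the
    pages' existing machine frames by editing the translation table only."""
    for i, gfn in enumerate(scattered_gfns):
        guest_to_machine[target_gfn_base + i] = guest_to_machine.pop(gfn)  # O(1) per page
```

The reference-based variant does constant work per page regardless of the page size, which is where the reported speedup comes from; a real implementation must also invalidate stale TLB and second-level translation entries, which the sketch omits.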
Wenpo ZHANG Kazuteru NAMBA Hideo ITO
In recent VLSIs, small-delay defects, which are hard to detect by traditional delay fault testing, can cause serious issues such as short lifetime. To detect small-delay defects, on-chip delay measurement, which measures the delay time of paths in the circuit under test (CUT), was proposed. However, this approach incurs high test cost because it uses scan design, which leads to long test application time due to scan shift operations. Our solution is a method that reduces the test application time of testing with on-chip path delay measurement. Unlike conventional delay testing, testing with on-chip path delay measurement does not require capture operations. Specifically, the FFs keep the transition pattern of the test pattern pair sensitizing a path under measurement (PUM), denoted as p, even after the measurement of p. The proposed method exploits this characteristic and reduces scan shift time and test data volume through test pattern merging. Evaluation results on ISCAS89 benchmark circuits indicate that the proposed method reduces the test application time by 6.89∼62.67% and the test data volume by 46.39∼74.86%.
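The merging step itself can be pictured with generic compatibility-based cube merging (a hedged sketch; it does not model the FF-retention property or the scan-shift savings specific to the proposed method):

```python
# Generic compatibility-based test cube merging (X = don't care).

X = 'X'

def compatible(a, b):
    """Two cubes can be merged iff no position holds conflicting 0/1 values."""
    return all(p == q or X in (p, q) for p, q in zip(a, b))

def merge(a, b):
    """The merged cube takes the specified bit wherever either cube specifies one."""
    return ''.join(q if p == X else p for p, q in zip(a, b))

def greedy_merge(cubes):
    merged = []
    for c in cubes:
        for i, m in enumerate(merged):
            if compatible(m, c):
                merged[i] = merge(m, c)   # fold c into an existing compatible cube
                break
        else:
            merged.append(c)              # no compatible partner: keep c as-is
    return merged
```

For example, greedy_merge(["01XX", "0X1X", "1XX0"]) merges the first two cubes into "011X" and keeps "1XX0" separate.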
Hiroshi YAMAZAKI Motohiro WAKAZONO Toshinori HOSOKAWA Masayoshi YOSHIMURA
In recent years, the growing density and complexity of VLSIs have led to an increase in the number of test patterns and fault models. Test patterns used in VLSI testing are required to provide high quality and low cost. Don't-care (X) identification techniques and X-filling techniques are methods to satisfy these requirements. However, conventional X-identification techniques are less effective for application-specific purposes such as test compaction because the X-bits concentrate on particular primary inputs and pseudo primary inputs. In this paper, we propose a don't-care identification method for test compaction. The experimental results for the ITC'99 and ISCAS'89 benchmark circuits show that a given test set can be efficiently compacted by the proposed method.
Kohei MIYASE Xiaoqing WEN Hiroshi FURUKAWA Yuta YAMATO Seiji KAJIHARA Patrick GIRARD Laung-Terng WANG Mohammad TEHRANIPOOR
At-speed scan testing is susceptible to yield-loss risk due to power supply noise caused by excessive launch switching activity. This paper proposes a novel two-stage scheme, namely CTX (Clock-Gating-Based Test Relaxation and X-Filling), for reducing switching activity when a test stimulus is launched. Test relaxation and X-filling are conducted (1) to make as many FFs as possible inactive by disabling the corresponding clock control signals of the clock-gating circuitry in Stage-1 (Clock-Disabling), and (2) to equalize the input and output values of as many remaining active FFs as possible in Stage-2 (FF-Silencing). CTX effectively reduces launch switching activity, and thus yield-loss risk, even when only a small number of don't-care (X) bits are present (as in test compression), without any impact on test data volume, fault coverage, performance, or circuit design.
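The FF-Silencing stage can be pictured with a deliberately simplified sketch (an illustration only: real CTX justifies values through the circuit netlist, which is not modeled here):

```python
# Simplified FF-Silencing: where the test cube leaves an X at a flip-flop's
# next-state input, fill it with the FF's current output so the FF does not
# toggle at launch.

def ff_silence(next_state_bits, current_outputs):
    """next_state_bits: per-FF launch inputs, 'X' where unassigned."""
    return [o if i == 'X' else i for i, o in zip(next_state_bits, current_outputs)]

# ff_silence(list("1X0X"), list("1100")) -> ['1', '1', '0', '0']
# The two X-filled FFs now hold their values at launch, cutting switching activity.
```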
Masayuki ARAI Satoshi FUKUMOTO Kazuhiko IWASAKI
Convolutional compactors offer a promising technique for compacting test responses. In this study we expand the convolutional compactor architecture onto a Galois field in order to improve the compaction ratio as well as reduce the X-masking probability, namely, the probability that an error is masked by unknown values. While each scan chain is independently connected by EOR gates in the conventional arrangement, the proposed scheme treats q signals as an element over GF(2^q), and the connections are configured on the same field. We show the arrangement of the proposed compactors and the equivalent expression over GF(2). We then evaluate the effectiveness of the proposed expansion in terms of X-masking probability by simulations with a uniform distribution of X-values, as well as the reduction of hardware overheads. Furthermore, we evaluate a multi-weight arrangement of the proposed compactors for non-uniform X distributions.
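A hedged sketch of the field arithmetic involved, for q = 2 (the register structure and tap layout below are generic, not the paper's exact arrangement):

```python
# One clock of a convolutional compactor over GF(2^q), here q = 2. The GF(4)
# multiplication table assumes the primitive polynomial x^2 + x + 1; addition
# in GF(2^q) is bitwise XOR, so q = 1 reduces to the conventional EOR gates.

GF4_MUL = [[0, 0, 0, 0],
           [0, 1, 2, 3],
           [0, 2, 3, 1],
           [0, 3, 1, 2]]

def compactor_cycle(state, inputs, taps):
    """state:  register stages, each a GF(4) element; state[0] streams out
    inputs: one GF(4) element per group of q = 2 adjacent scan chains
    taps:   taps[i][j] = GF(4) constant coupling group j into stage i (0 = no tap)"""
    out = state[0]
    nxt = state[1:] + [0]                     # shift the register toward the output
    for i in range(len(nxt)):
        for j, x in enumerate(inputs):
            nxt[i] ^= GF4_MUL[taps[i][j]][x]  # scale by the tap constant, then add
    return nxt, out
```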
Masayuki ARAI Satoshi FUKUMOTO Kazuhiko IWASAKI Tatsuru MATSUO Takahisa HIRAIDE Hideaki KONISHI Michiaki EMORI Takashi AIKYO
We developed a test data compression scheme for scan-based BIST, aiming to compress test stimuli and responses by more than 100 times. As the scan-BIST architecture, we adopt BIST-Aided Scan Test (BAST) and combine four techniques: the invert-and-shift operation, run-length compression, scan address partitioning, and LFSR pre-shifting. Our scheme achieved a 100x compression rate in environments where Xs do not occur, without reducing the fault coverage of the original ATPG vectors. Furthermore, we enhanced the masking logic to reduce the data for X-masking so that test data is still compressed to 1/100 in a practical environment where Xs occur. We applied our scheme to five real VLSI chips, and the technique compressed the test data by 100x for scan-based BIST.
Hidenari NAKASHIMA Junpei INOUE Kenichi OKADA Kazuya MASU
Interconnect Length Distribution (ILD) represents the correlation between the number of interconnects and their length. The ILD can predict power consumption, clock frequency, chip size, etc. High core utilization and small circuit area have been reported to improve chip performance. We propose an ILD model to predict the correlation between core utilization and chip performance. The proposed model predicts the influence of interconnect length and interconnect density on circuit performance. As core utilization increases, small, simple circuits improve in performance. In large, complex circuits, decreasing the wire coupling capacitance is more important than decreasing the total interconnect length for improving chip performance. The proposed ILD model expresses the actual ILD more accurately than conventional models.
Hiroaki ITOGA Chikaaki KODAMA Kunihiro FUJIYOSHI
In VLSI layout design, a floorplan is often obtained to define the rough arrangement of modules in the early design stage. In that stage, the aspect ratio of each soft module is also determined. The aspect ratio of each module can be changed within a designated range while keeping its area. In this paper, in order to determine the aspect ratios, we propose a graph-based one-dimensional compaction method which determines the aspect ratios quickly under the constraint that the topology of the floorplan must not be changed. The proposed method is divided into two steps: (1) selection of a minimal set of soft modules whose aspect ratios are to be adjusted, and (2) decision on the aspect ratios. Step (1) is formulated as the minimal cut problem in graph theory; we solve it by transforming it into the shortest path problem. Step (2) is divided into two operations: one determines the increment limit in height or width of each soft module, and the other determines the aspect ratio of each soft module by the Newton-Raphson method, as sketched below. The experimental comparisons show the effectiveness of the proposed method.
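A plausible shape for the Newton-Raphson step (an assumed formulation for illustration, not the paper's exact equations): take the selected soft modules as stacked in the compaction direction, each with fixed area and an aspect-ratio range, and solve for the interpolation parameter t at which the stack reaches a target height.

```python
import math

# Assumed model: module i has fixed area A_i and aspect ratio
# r_i(t) = lo_i + t*(hi_i - lo_i), so its height is sqrt(A_i * r_i)
# (since h*w = A and h/w = r). Solve total_height(t) = target.

def total_height(t, areas, r_lo, r_hi):
    return sum(math.sqrt(a * (lo + t * (hi - lo)))
               for a, lo, hi in zip(areas, r_lo, r_hi))

def solve_t(areas, r_lo, r_hi, target, t=0.5, eps=1e-9):
    """Newton-Raphson on f(t) = total_height(t) - target (derivative by a
    central finite difference for brevity)."""
    for _ in range(50):
        f = total_height(t, areas, r_lo, r_hi) - target
        if abs(f) < eps:
            break
        h = 1e-6
        df = (total_height(t + h, areas, r_lo, r_hi)
              - total_height(t - h, areas, r_lo, r_hi)) / (2 * h)
        t -= f / df
    return min(max(t, 0.0), 1.0)   # clamp to the designated range
```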
Chih-Yang HSU Chien-Nan Jimmy LIU Jing-Yang JOU
For large circuits, vector compaction techniques can provide a faster solution for power estimation with reasonable accuracy. Because the traditional sampling approach incurs useless transitions between sampled pattern pairs after they are concatenated into a single sequence for simulation, we previously proposed a vector compaction method with grouping and a single-sequence consecutive sampling technique to solve this problem. However, it is quite possible that no perfect consecutive sequence without any undesired transitions can be found. In such cases, the compaction ratio of the sequence length may not improve much. In this paper, we propose an efficient approach that relaxes this limitation slightly so that multiple consecutive sequences are allowed. We also propose an algorithm that reduces the number of sequences, instead of fixing it at one, to find better solutions to the vector compaction problem. As demonstrated in the experimental results, the average compaction ratio and speedup can be significantly improved by using this new approach.
In this paper, we present a novel energy compaction method, called selective block-wise reordering, which is used with SPIHT (SBR-SPIHT) coding for low-rate video coding to enhance the coding efficiency for motion-compensated residuals. In the proposed coding system, the motion estimation and motion compensation schemes of H.263 are used to reduce the temporal redundancy. The residuals are then wavelet transformed. The block-mapping reorganization utilizes the wavelet zerotree relationship, which jointly represents the wavelet coefficients from the lowest subband to the high-frequency subbands at the same spatial location, and allocates each wavelet tree with all its descendants to form a wavelet block. The selective multi-layer block-wise reordering technique is then applied to those wavelet blocks whose energy is higher than a threshold to enhance the energy compaction by rearranging the significant pixels in a block toward the upper-left corner based on the magnitude of energy. An improved SPIHT coding is then applied to each wavelet block, whether reordered or not. The high energy compaction resulting from the block reordering reduces the number of redundant bits in the sorting pass and improves the quantization efficiency in the refinement pass of SPIHT coding. Simulation results demonstrate that SBR-SPIHT outperforms H.263 by 0.69 to 1.28 dB on average for various video sequences at very low bit rates, ranging from 10 to 48 kbps.
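The core reordering operation can be sketched as follows (illustrative only; the actual SBR-SPIHT scheme is multi-layer and must also convey the permutation so the decoder can undo the reordering):

```python
# Illustrative block-wise reordering: coefficients of a high-energy n-by-n
# wavelet block are sorted by magnitude and refilled in raster order, pushing
# significant values toward the upper-left corner.

def block_energy(block):
    return sum(c * c for row in block for c in row)

def reorder_block(block):
    n = len(block)
    coeffs = sorted((c for row in block for c in row), key=abs, reverse=True)
    return [[coeffs[i * n + j] for j in range(n)] for i in range(n)]

def selective_reorder(blocks, threshold):
    """Only blocks whose energy exceeds the threshold are reordered (the
    'selective' part of SBR)."""
    return [reorder_block(b) if block_energy(b) > threshold else b for b in blocks]
```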
In this paper, we propose a new method of automatic speech summarization for each utterance, in which a set of words that maximizes a summarization score is extracted from automatic speech transcriptions. The summarization score indicates the appropriateness of summarized sentences. The extraction is achieved by a dynamic programming technique according to a target summarization ratio, that is, the number of characters/words in the summarized sentence divided by the number of characters/words in the original sentence. The extracted set of words is then connected to build a summarized sentence. The summarization score consists of a word significance measure, a linguistic likelihood, and a confidence measure. This paper also proposes a new method of measuring summarization accuracy based on a word network expressing manual summarization results. The summarization accuracy of each automatic summarization is calculated by comparing it with the most similar word string in the network. Japanese broadcast-news speech, transcribed using a large-vocabulary continuous-speech recognition (LVCSR) system, is summarized and evaluated using the proposed method with 20, 40, 60, 70 and 80% summarization ratios. Experimental results reveal that the proposed method can effectively extract relatively important information by removing redundant or irrelevant information.
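A hedged sketch of the dynamic-programming extraction (the exact weighting of the three score components is not reproduced; word_score and bigram_score are assumed callbacks combining significance, confidence, and linguistic likelihood):

```python
# Choose m of the n transcribed words (preserving order) so that the sum of
# per-word scores plus bigram scores between consecutive selected words is
# maximized; m is set by the target summarization ratio.

def summarize(words, word_score, bigram_score, ratio):
    n = len(words)
    m = max(1, min(n, round(n * ratio)))
    NEG = float("-inf")
    # best[k][j]: best score of a k-word summary whose last word is words[j]
    best = [[NEG] * n for _ in range(m + 1)]
    back = [[None] * n for _ in range(m + 1)]
    for j in range(n):
        best[1][j] = word_score(words[j])
    for k in range(2, m + 1):
        for j in range(k - 1, n):
            for i in range(k - 2, j):
                if best[k - 1][i] == NEG:
                    continue
                s = best[k - 1][i] + word_score(words[j]) + bigram_score(words[i], words[j])
                if s > best[k][j]:
                    best[k][j], back[k][j] = s, i
    # trace back from the best-scoring final word
    j = max(range(n), key=lambda t: best[m][t])
    out, k = [], m
    while j is not None:
        out.append(words[j])
        j, k = back[k][j], k - 1
    return list(reversed(out))
```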
Hiroaki SUZUKI Hiroyuki KAWAI Hiroshi MAKINO Yoshio MATSUDA
A VLIW (Very Long Instruction Word) architecture with a new code compaction method is proposed. For a 3D-geometry processor, we consider two types of 2-issue VLIW architectures, the floating-point-execution-accelerating VLIW (FP-VLIW) and the data-move-enhancing VLIW (MV-VLIW) architectures, as expansions of a Single-Streaming Single Instruction, Multiple Data (SS-SIMD) architecture. To solve the code bloat problem common to VLIW architectures, the proposed method compacts the original code into VLIW code with software tools and decompacts the VLIW code with a simple on-chip hardware decompactor composed of an instruction-swap circuit. The speeds and code densities of the two VLIWs with code compaction are compared to the SS-SIMD with the same instruction set and the same building blocks. The FP-VLIW shows the fastest speed in the evaluation results of the viewperf CDRS-03 benchmark programs; it is 36% faster than the SS-SIMD used as the reference. The proposed compaction method keeps 95% of the code density of the SS-SIMD. One test program shows that the code density of the MV-VLIW is higher than that of the SS-SIMD, demonstrating that the merit of compacting nops can be greater than the VLIW penalty. The FP-VLIW architecture with code compaction achieves 1.36 times the speed performance without significant code-density deterioration.
Tsuyoshi SHINOGI Terumine HAYASHI
IDDQ testing, or current testing, is a powerful method that detects a large class of defects causing abnormal quiescent current by measuring the power supply current. One of the problems of IDDQ testing that prevents its full practical use in manufacturing is that the testing speed is slow owing to time-consuming IDDQ measurement. One of the solutions to this problem is test pattern compaction. This paper presents an efficient method for generating a compact test set for IDDQ testing of bridging faults in combinational CMOS circuits. Our method is based on iterative improvement. Each random primary input pattern is iteratively improved by changing its values pin by pin, in order, so as to increase the number of newly detected faults in the currently undetected fault set. While our method is simple and easy to implement, it is efficient. Experimental results for the large ISCAS benchmark circuits demonstrate its efficiency in comparison with the results of previous methods.
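The improvement loop can be sketched as follows (the fault-simulation callback newly_detected, which returns the set of still-undetected faults a pattern catches, is an assumed interface, not the paper's code):

```python
# Pin-by-pin iterative improvement of one random primary input pattern.

def improve_pattern(pattern, newly_detected):
    """Flip each pin in order; keep a flip only if it strictly increases the
    number of newly detected faults. Repeat passes until no flip helps."""
    best = len(newly_detected(pattern))
    improved = True
    while improved:
        improved = False
        for pin in range(len(pattern)):
            pattern[pin] ^= 1                    # tentatively flip this pin
            score = len(newly_detected(pattern))
            if score > best:
                best, improved = score, True     # keep the improving flip
            else:
                pattern[pin] ^= 1                # undo a non-improving flip
    return pattern
```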
Radu MARCULESCU Diana MARCULESCU Massoud PEDRAM
This paper presents an effective and robust technique for compacting a large sequence of input vectors into a much smaller input sequence so as to reduce the circuit/gate level simulation time by orders of magnitude and maintain the accuracy of the power estimates. In particular, this paper introduces and characterizes a family of dynamic Markov trees that can model complex spatiotemporal correlations which occur during power estimation both in combinational and sequential circuits. As the results demonstrate, large compaction ratios of 1-2 orders of magnitude can be obtained without significant loss (less than 5% on average) in the accuracy of power estimates.
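A first-order flavor of the idea (a plain Markov chain over whole input vectors, far simpler than the paper's dynamic Markov trees, which adapt their structure and also capture spatial bit-level correlations; vectors are assumed hashable, e.g. bit tuples):

```python
import random
from collections import Counter, defaultdict

def compact_sequence(vectors, ratio, seed=0):
    """Learn transition counts from the long sequence, then regenerate a much
    shorter sequence whose transition statistics approximate the original's."""
    trans = defaultdict(Counter)
    for a, b in zip(vectors, vectors[1:]):
        trans[a][b] += 1
    rng = random.Random(seed)
    state = vectors[0]
    out = [state]
    for _ in range(int(len(vectors) * ratio) - 1):
        nxt = trans.get(state)
        if not nxt:                  # dead end: restart from the start state
            state = vectors[0]
        else:
            cands, weights = zip(*nxt.items())
            state = rng.choices(cands, weights=weights)[0]
        out.append(state)
    return out
```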
Pitchmatching algorithms are widely used in layout environments where no grid constraints are imposed. However, realistic layouts include multiple grid constraints, which facilitate the application of automatic routing. Hence, pitchmatching algorithms should be extended to such realistic layouts. This paper formulates a pitchmatching problem with multiple grid constraints. An algorithm for solving this problem is constructed as an extension of conventional pitchmatching algorithms. The computational complexity is also discussed in comparison with a conventional naive algorithm. Finally, examples and application results for realistic layouts are presented.
Kazuhisa OKADA Hidetoshi ONODERA Keikichi TAMURA
We propose a new compaction problem that allows layout elements to have many shape possibilities. The objective of the problem is to find not only the positions but also the shapes of layout elements. We present an efficient method to solve the problem: compaction with shape optimization. This method simplifies the problem by optimizing the shapes only of the layout elements on a critical path. The layout is compacted step by step while the shapes of layout elements are optimized. Another important aspect of this compaction technique is that it makes a layout "recyclable" for other sets of device parameters. The experimental examples, which attempt shape optimization and the recycling of analog layouts, confirm the importance and efficiency of our method.
Masahiko TOYONAGA Chie IWASAKI Yoshiaki SAWADA Toshiro AKINO
We present a new multi-layer over-the-cell channel router for standard cell layout design using simulated annealing. This new approach, STANZA-M, has two key features. The first is a new scheme for simulated annealing in which we use a cost function that evaluates both the total net length and the channel heights, together with an effective simulated annealing process over a limited range to obtain an optimal channel wiring in practical time. The second is a basic layer assignment procedure in which we assign all horizontal wiring inside a channel to feasible layers by considering the height of the channel, including the cell region, with a one-dimensional channel compaction process. We implemented our three-layer channel router in C on a Solbourne Series 5 workstation (22 MIPS). Experimental results for benchmarks such as Deutsch's Difficult Example and MCNC's PRIMARY1 channel routing problems indicate that STANZA-M achieves superior results compared to conventional routers, and the processing times are very fast despite the use of simulated annealing.
This paper describes a procedural detailed compaction method for the symbolic layout design of CMOS leaf cells and its algorithmic aspects. Simple symbolic representations that are loosely designed by users in advance are automatically converted into densely compacted physical patterns in two phases: symbolic-to-pattern conversion and segment-based detailed compaction. Both phases are executed using user-defined procedures and a specified set of design rules. The detailed compaction utilizes a segment-based constraint graph generated by an extended plane-sweep method in which various kinds of design rules can be applied. Since various basic operations can be applied to the individual segments of patterns in the procedures, detailed processing procedures can be described in accordance with fabrication process technologies and the corresponding sets of design rules. This combined stepwise procedure provides a highly flexible framework for the symbolic layout of CMOS leaf cells. The proposed approach was implemented in a symbolic layout system called CAMEL. To date, more than 300 kinds of symbolic representations of CMOS leaf cells have been designed and stored in the database. Using several different sets of design rules, the symbolic representations have been automatically converted into compacted patterns without design rule violations. The areas of the generated patterns averaged 98% of those of the manually designed patterns; even in the worst case, the increase in area was less than about 10% over the manually designed ones. Furthermore, processing times are much shorter than manual design periods: for example, 300 kinds of symbolic representations can be converted to the corresponding physical patterns in only a day. These practical design experiences with CAMEL show that our approach is more flexible and process-tolerant than conventional ones.
This paper describes a preconstrained compaction method and its application to the direct design-rule conversion of CMOS layouts. This approach can convert already-designed physical patterns into compacted layouts that satisfy user-specified design rules. Furthermore, preconstrained compaction can eliminate the unnecessarily extended diffusion areas and polysilicon wires that tend to be created by conventional longest-path-based compaction. Preconstrained compaction is constructed by combining a longest-path algorithm with forward and backward slack processes and a preconstraint generation process. This contrasts with previously proposed approaches based on longest-path algorithms followed by iterative improvement processes, which include applications of linear programming. The layout styles in those approaches are usually limited to a model where fixed-shape rectilinear blocks are moved so as to minimize the total length of the rectilinear interconnections among the blocks. Preconstrained compaction, however, can be applied to reshaping polygonal patterns such as diffusion and channel areas. Thus, this compaction method makes it possible to reuse CMOS leaf and macro cell layouts even if the design rules change. The proposed approach has been applied to the direct design-rule conversion of CMOS layouts containing from several to 10,195 transistors from 0.8-µm to 0.5-µm rules. Experimental results demonstrate that a 10.6% reduction in diffusion areas can be achieved without unnecessary extensions of polysilicon wires, at the cost of a 39% increase in processing time compared with conventional approaches.
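The longest-path core shared by this and the conventional compaction approaches can be sketched as follows (the slack and preconstraint-generation processes that distinguish preconstrained compaction are omitted):

```python
from collections import defaultdict, deque

# Each node is a layout element's x-coordinate; an edge (u, v, d) encodes the
# design-rule constraint x[v] >= x[u] + d. Longest paths in the (acyclic)
# constraint graph give the leftmost legal positions.

def longest_path_positions(n, edges):
    adj = defaultdict(list)
    indeg = [0] * n
    for u, v, d in edges:
        adj[u].append((v, d))
        indeg[v] += 1
    x = [0] * n
    ready = deque(i for i in range(n) if indeg[i] == 0)
    while ready:                             # process in topological order
        u = ready.popleft()
        for v, d in adj[u]:
            x[v] = max(x[v], x[u] + d)       # relax: push v right if needed
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return x
```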
Yasunori NAGATA Masao MUKAIDONO
In this paper, a fault model for multiple-valued programmable logic arrays (MV-PLAs) is proposed, and the equivalences of faults of MV-PLAs are discussed. In the assumed multiple-valued NOR/TSUM PLA model, it is shown that multiple-valued stuck-at faults, multiple-valued bridging faults, multiple-valued threshold shift faults, and some other faults in a literal generator circuit are equivalent or subequivalent to a multiple crosspoint fault in the NOR plane or a multiple fault of weights in the TSUM plane. These results lead to the fact that a multiple-valued test vector set that detects all multiple crosspoint faults and all multiple faults of weights also detects the above equivalent or subequivalent faults in an MV-PLA.