Seungyong BAEK Jingook KIM Joungho KIM
We propose an accurate and efficient model of having an unbalanced differential line structure, where mode-conversion and frequency dependent loss effects are considered in above the GHz frequency range. To extract model parameters of the proposed unbalanced differential line model, we measured s-parameters of test patterns using a 2-port VNA and defined a new type of mixed-mode s-parameter. The model parameters were obtained and are described for various types of the unbalanced differential line structures. Finally, the validity of the proposed model and the model parameters were successfully confirmed by a series of time-domain measurements and a lattice diagram analysis.
Feng CHENG Junfa MAO Xiaochun LI
A timing-driven placement algorithm based on path topology analysis is presented. The optimization for path delay is transformed into cell location optimization. The algorithm pays much attention on path topologies and applies an effective force directed method to find cell target locations. Total wire length optimization is combined with the timing-driven placement algorithm. MCNC (Microelectronics Centre of North-Carolina) standard cell benchmarks are experimented and results show that our timing-driven placement algorithm can make the longest path delay improve up to 13% compared with wirelength driven placement.
Zhi Liang WANG Osami WADA Takashi HARADA Takahiro YAGUCHI Yoshitaka TOYOTA Ryuji KOGA
Power bus noise problem has become a major concern for both EMC engineers and board designers. A fast algorithm, based on the cavity-mode model, was employed for analyzing resonance characteristics of multilayer power bus stacks interconnected by vias. The via is modeled as an inductance and its value is given by a simple expression. Good agreement between the simulated results and measurements demonstrates the effectiveness of the cavity-mode model, together with the via model.
Flavio CANAVERO Stefano GRIVET-TALOCIA Ivan A. MAIO Igor S. STIEVANO
This paper presents a systematic methodology for the system-level assessment of signal integrity and electromagnetic compatibility effects in high-speed communication and information systems. The proposed modeling strategy is illustrated via a case study consisting of a critical coupled net of a complex system. Three main methodologies are employed for the construction of accurate and efficient macromodels for each of the sub-structures typically found along the signal propagation paths, i.e. drivers/receivers, transmission-line interconnects, and interconnects with a complex 3D geometry such as vias and connectors. The resulting macromodels are cast in a common form, enabling the use of either SPICE-like circuit solvers or VHDL-AMS equation-based solvers for system-level EMC predictions.
Lakshmi K. VAKATI Kishore K. MUCHHERLA Janet M. WANG
The scaled down feature size and the increased frequency of today's deep sub-micron region call for fundamental changes in driver-load models. To be more specific, new driver-load models need to take into consideration the nonlinear behavior of the drivers, the inductance effects of the loads, and the slew rates of the output waveforms. Current driver-load models use the conventional single Ceff (one-ramp) approach and treat the interconnect load as lumped RC networks. Neither the nonlinear property nor the inductance effects were considered. The accuracy of these existing models is therefore questionable. This paper introduces a new multi-ramp driver model that represents the interconnect load as a distributed RLC network. The employed two effective capacitance values capture the nonlinear behavior of the driver. The lossy transmission line approach accounts for the impact of inductance when modeling the driving point interconnect load. The new model shows improvements of 9% in the average delay error and 2.2% in the slew rate error compared to SPICE.
Herng-Jer LEE Chia-Chi CHU Ming-Hong LAI Wu-Shiung FENG
A method is proposed to compute moments of distributed coupled RLC interconnects. Both uniform line models and non-uniform line models will be developed. Considering both self inductances and mutual inductances in multi-conductors, recursive moment computations formulae of lumped coupled RLC interconnects are extended to those of distributed coupled RLC interconnects. By using the moment computation technique in conjunction with the projection-based order reduction method, the inductive crosstalk noise waveform can be accurately and efficiently estimated. Fundamental developments of the proposed approach will be described. Simulation results demonstrate the improved accuracy of the proposed method over the traditional lumped methods.
This paper describes a high-speed D/A converter design for mixed-mode systems. Capacitive coupling induced by inter-chip interconnects and time-variant clock skew between ICs should be considered for mixed-mode systems, and on-chip interconnects should be treated as transmission lines in the circuit simulation as operating speed reaches GHz range. A robust FIFO built in the D/A converter can absorb input data timing variance due to the capacitive coupling and the clock timing skew, the worst-case margin of which is 1.5TCLK. Distributed RLC transmission line models for on-chip interconnects produce accurate simulation results at 1 GHz clock frequency over lumped models. For optimized D/A converter design, behavioral modeling methodology is also presented in this paper. Measurement results verify the accuracy of the on-chip interconnect and behavioral models.
Hideki SHIMA Toshimasa MATSUOKA Kenji TANIGUCHI
A new inductance extraction technique of spiral inductor from measurement fixture is presented. We propose a scalable expression of parasitic inductance for interconnects, and design consideration of test structure accommodating spiral inductor. The simple expression includes mutual inductance between the interconnects with high accuracy. The formula matches a commercial field solver inductance values within 1.4%. The layout of the test structure to reduce magnetic coupling between the spiral and the interconnects allows us to extract the intrinsic inductance of spiral more accurately. The proposed technique requires neither special fixture used for measurement-based method nor skilled worker for precise extraction with the analytical technique used.
Jong Wook KWAK Hyong Jin BAN Chu Shik JHON
In this letter, we propose "Torus Ring", which is a modified version of 2-level hierarchical ring. The Torus Ring has the same complexity as the hierarchical rings, since the only difference is the way it connects the local rings. It has an advantage over the hierarchical ring when the destination of a packet is the adjacent local ring, especially to the backward direction. Although we assume that the destination of a network packet is uniformly distributed across the processing nodes, the average number of hops in Torus Ring is equal to that of the hierarchical ring. However, the performance gain of the Torus Ring is expected to increase, due to the spatial locality of the application programs in the real parallel programming environment. In the simulation results, latencies of the interconnection network are reduced by up to 19%, with moderate ring utilization ratios.
Akira TSUCHIYA Masanori HASHIMOTO Hidetoshi ONODERA
This paper discusses performance limitation of on-chip interconnects. On-chip global interconnects are considered to be a bottleneck of high-performance LSIs. To overcome this issue, high-speed signaling and large throughput interconnection using electrical wires have been studied. However the limitation of on-chip interconnects has not been examined sufficiently. This paper reveals the maximum performance of on-chip global interconnects based on derived analytic expressions and detailed circuit simulation. We derive trade-off curves among bit rate, interconnect length, and eye opening both for single-end and for differential signaling. The results show that differential signaling improves signaling performance several times compared with conventional single-end signaling, and demonstrate that 80 Gbps differential signaling on 10 mm interconnects is promising.
Chia-Chi CHU Herng-Jer LEE Wu-Shiung FENG
Projection-based model reductions become a necessity for efficient interconnect modeling and simulations. In order to choose the order of the reduced system that can really reflect the essential dynamics of the original interconnect, the residual error of the transfer function can be considered as a stopping criteria to terminate the Arnoldi iteration process. Analytical expressions of this residual error are derived in detail. Furthermore, it can be found that the approximate transfer function can also be expressed as the original interconnect model with some additive perturbations. The perturbation matrix only involves resultant vectors at the previous step of the Arnoldi algorithm. These error information will provide a guideline for the order selection scheme used in the Krylov subspace model-order algorithm.
Yoshihito HASHIMOTO Shinichi YOROZU Yoshio KAMEDA Akira FUJIMAKI Hirotaka TERAI Nobuyuki YOSHIKAWA
To enable the use of passive transmission lines (PTLs) for the interconnection of single-flux-quantum (SFQ) circuits, we have implemented a driver and a receiver and have developed a method for designing SFQ circuits with passive interconnections. Basic components and properties of passive interconnections, such as the frequency characteristics of the driver and receiver, the PTL delay, and the crosstalk between PTLs, have been experimentally verified. Our developed components and design method have been applied to actual SFQ circuits, such as a 44 switch having block-to-block passive interconnections and a 22 switch having gate-to-gate passive interconnections. We have also shown the advantages of PTLs over Josephson transmission lines (JTLs). We also discuss the prospects of SFQ circuits having passive interconnections.
Xiren WANG Deyan LIU Wenjian YU Zeyi WANG
Efficient extraction of interconnect parasitic parameters has become very important for present deep submicron designs. In this paper, the improved boundary element method (BEM) is presented for 3-D interconnect resistance extraction. The BEM is accelerated by the recently proposed quasi-multiple medium (QMM) technology, which quasi-cuts the calculated region to enlarge the sparsity of the overall coefficient matrix to solve. An un-average quasi-cutting scheme for QMM, advanced nonuniform element partition and technique of employing the linear element for some special surfaces are proposed. These improvements considerably condense the computational resource of the QMM-based BEM without loss of accuracy. Experiments on actual layout cases show that the presented method is several hundred to several thousand times faster than the well-known commercial software Raphael, while preserving the high accuracy.
Junji KAWATA Yuichi TANJI Yoshifumi NISHIO Akio USHIDA
In this paper, we propose a new algorithm for calculating the exact poles of the admittance matrix of RLCG interconnects. After choosing dominant poles and corresponding residues, each element of the exact admittance matrix is approximated by partial fraction. A procedure to obtain the residues that guarantee the passivity is also provided, based on experimental studies. In the procedure the residues are calculated by using the least squares method so that the partial fraction matches each element of the exact admittance matrix in the frequency-domain. From the partial fraction representation, the asymptotic equivalent circuit models which can be easily simulated with SPICE are synthesized. It is shown that an efficient model-order reduction is possible for short-length interconnects.
Michihiro KOIBUCHI Akiya JOURAKU Hideharu AMANO
Adaptive routing algorithms, which dynamically select the route of a packet, have been widely studied for interconnection networks in massively parallel computers. An output selection function (OSF), which decides the output channel when some legal channels are free, is essential for an adaptive routing. In this paper, we propose a simple and efficient OSF called minimal multiplexed and least-recently-used (MMLRU). The MMLRU selection function has the following simple strategies for distributing the traffic: 1) each router locally grasps the congestion information by the utilization ratio of its own physical channels; 2) it is divided into the two selection steps, the choice from available physical channels and the choice from available virtual channels. The MMLRU selection function can be used on any type of network topology and adaptive routing algorithm. Simulation results show that the MMLRU selection function improves throughput and latency especially when the number of dimension becomes larger or the number of nodes per dimension become larger.
Kan TAKEUCHI Kazumasa YANAGISAWA Kazuko SAKAMOTO Teruya TANAKA
The optimum tier architectures for ASICs are investigated by using a methodology for predicting packing efficiency of a logic block (the ratio of total cell area to the block area including space regions between cells). In the methodology based on Rent's rule, (1) the empirical parameters required for the prediction are derived from the results of our ASIC products. (2) The concept of logic distance, which is expressed in units of the number of cells rather than the absolute net length, is introduced. (3) Not only performance constraints but also reliability constraints are incorporated. These allow us to make a quantitative comparison of the packing efficiency between various cell and tier structures. It is found that, for mega-cell blocks, all minimum-pitch layer architecture with buffer insertion is expected to give more than 20% reduction in block areas compared to the minimum-pitch + bi-pitch architecture, while satisfying the performance and reliability constraints.
Hideyuki FURUYA Sungwoo CHA Yoshiyuki SHIMIZU Masaki HARUOKA Toshimasa MATSUOKA Kenji TANIGUCHI
A demodulator for short-range wireless interconnect using ASK/CDMA technique has been developed with 0.25 µm CMOS technology. The fabricated demodulator demonstrates the demodulation of 7.35 Mbps bit rate with 31 spread spectrum code length at 10 GHz carrier frequency.
Sung Woo CHUNG Hyong-Shik KIM Chu Shik JHON
In CC-NUMA multiprocessor systems, it is important to reduce the remote memory access time. Based upon the fact that increasing the size of the LRU second-level (L2) cache more than a certain value does not reduce the cache miss rate significantly, in this paper, we propose two split L2 caches to utilize the surplus of the L2 cache. The split L2 caches are composed of a traditional LRU cache and another cache to reduce the remote memory access time. Both work together to reduce total L2 cache miss time by keeping remote (or long-distance) blocks as well as recently used blocks. For another cache, we propose two alternatives: an L2-RVC (Level 2 - Remote Victim Cache) and an L2-DAVC (Level 2 - Distance-Aware Victim Cache). The proposed split L2 caches reduce total execution time by up to 27%. It is also found that the proposed split L2 caches outperform the traditional single LRU cache of double size.
Tran CONG SO Shigeru OYANAGI Katsuhiro YAMAZAKI
We have proposed a speculative selection function for adaptive routing, which uses idle cycles of the network physical links to exchange network information between nodes, thus helps to decide the best selection. Previous study on the mesh network showed that SSR gives message selection flexibility that improves network performance by balancing the network traffic in both global and local scopes. This paper evaluates the speculative selection function on 2D torus network with simulation. The simulation compares the network throughput and latency with various traffic patterns. The visualization graphs show how the speculative selection eliminates hotspots and disperses traffic in the global scope. The simulation results demonstrate that by using speculative selection, the network performance is increased by around 7%. Compared to the mesh network, the torus's version has smaller gain due to the high performance nature of the torus network.
This paper proposes several novel hierarchical interconnection networks based on the (3, 3)-graphs, namely folded (3, 3)-networks, root-folded (3, 3)-networks, recursively expanded (3, 3)-networks, and flooded (3, 3)-networks. Just as the hypercubes, CCC, Peterson-based networks, and Heawood-based networks, these hierarchical networks have the following nice properties: regular topology, high scalability, and small diameters. Due to these important properties, these hierarchical networks seem to have the potential as alternatives for the future interconnection structures of multicomputer systems, especially massively parallel processors (MPPs). Furthermore, this paper will present the routing and broadcasting algorithms for these proposed networks to demonstrate that these algorithms are as elegant as the algorithms for hypercubes, CCC, and Petersen- or Heawood-based networks.