1-19hit |
Liangpeng GUO Yici CAI Qiang ZHOU Xianlong HONG
Multiple supply voltage (MSV) is an effective scheme to achieve low power. Recent works in MSV are based on physical level and aim at reducing physical overheads, but all of them do not consider level converter, which is one of the most important issues in dual-vdd design. In this work, a logic and layout aware methodology and related algorithms combining voltage assignment and placement are proposed to minimize the number of level converters and to implement voltage islands with minimal physical overheads. Experimental results show that our approach uses much fewer level converters (reduced by 83.23% on average) and improves the power savings by 16% on average compared to the previous approach [1]. Furthermore, the methodology is able to produce feasible placement with a small impact to traditional placement goals.
Sheqin DONG Xianlong HONG Song CHEN Xin QI Ruijie WANG Jun GU
Solution space smoothing allows a local search heuristic to escape from a poor, local minimum. In this paper, we propose a technique that can smooth the rugged terrain surface of the solution space of a placement problem. We test the smoothing heuristics for MCNC benchmarks, and for VLSI placement with pre-placed modules and placement with consideration of congestion. Experiment results demonstrated that solution space smoothing is very efficient for VLSI module placement, and it can be applied to all floorplanning representations proposed so far.
Jingyu XU Xianlong HONG Tong JING
Timing optimization is an important goal of global routing in deep submicron era. To guarantee the timing performance of the circuit, merely adopting topology optimization becomes inadequate. In this paper, we present an efficient timing-driven global routing algorithm with buffer insertion. Our approach is capable of applying topological-based timing optimization and buffer insertion simultaneously with routablity considerations. Compared with previous works, we efficiently solve the timing issues under a limited buffer usage. The experimental results have demonstrated significant delay improvement within short runtime with very small number of buffers inserted.
Shan ZENG Wenjian YU Jin SHI Xianlong HONG Chung-Kuan CHENG
Inductive effect becomes important for on-chip global interconnects, like the power/ground (P/G) grid. Because of the locality property of partial reluctance, the inverse of partial inductance, the window-based partial reluctance extraction has been applied for large-scale interconnect structures. In this paper, an efficient method of partial reluctance extraction is proposed for large-scale regular P/G grid structures. With a block reuse technique, the proposed method makes full use of the structural regularity of the P/G grid. Numerical results demonstrate the proposed method is able to efficiently handle a P/G grid with up to one hundred thousands wire segments. It is several tens times faster than the window-based method, while generating accurate frequency-dependent partial reluctance and resistance.
Weixiang SHEN Yici CAI Xianlong HONG Jiang HU
As power consumption of the clock tree dominates over 40% of the total power in modern high performance VLSI designs, measures must be taken to keep it under control. One of the most effective methods is based on clock gating to shut off the clock when the modules are idle. However, previous works on gated clock tree power minimization are mostly focused on clock routing and the improvements are often limited by the given registers placement. The purpose of this work is to navigate the registers during placement to further reduce the clock tree power based on clock gating. Our method performs activity-aware register clustering that reduces the clock tree power not only by clumping the registers into a smaller area, but also by pulling the registers with the similar activity patterns closely to shut off the clock more time for the resultant subtrees. In order to reduce the impact of signal nets wirelength and power due to register clustering, we apply the timing and activity based net weighting in [14], which reduces the nets switching power by assigning a combination of activity and timing weights to the nets with higher switching rates or more critical timing. To tradeoff the power dissipated by the clock tree and the control signal, we extend the idea of local ungating in [6] and propose an algorithm of gate control signal optimization, which still sets the gate enable signal high if a register is active for a number of consecutive clock cycles. Experimental results on a set of MCNC benchmarks show that our approach is able to reduce the power and total wirelength of clock tree greatly with minimal overheads.
Kan WANG Sheqin DONG Yuchun MA Yu WANG Xianlong HONG Jason CONG
Due to the increased power density and lower thermal conductivity, 3D ICs are faced with heat dissipation and temperature problem seriously. TSV (Through-Silicon-Via) has been shown as an effective way to help heat removal, but they introduce several issues related with cost and reliability as well. Previous researches of TSV planning didn't pay much attention to the impact of leakage power, which will bring in error on estimation of temperature, TSV number and also critical path delay. The leakage-temperature-delay dependence can potentially negate the performance improvement of 3D designs. In this paper, we analyze the impact of leakage power on TSV planning and integrate leakage-temperature-delay dependence into thermal via planning of 3D ICs. A weighted via insertion approach, considering the influence on both module delay and wire delay, is proposed to achieve the best balance among temperature, via number and performance. Experiment results show that, with leakage power and resource constraint considered, temperature and the required via number can be quite different, and the weighted TSV insertion approach with iterative process can obtain the trade-off between different factors including thermal, power consumption, via number and performance.
Yanming JIA Yici CAI Xianlong HONG
This paper studies the impact of dummy fill for chemical mechanical polishing (CMP)-induced capacitance variation on buffer insertion based on a virtual CMP fill estimation model. Compared with existing methods, our algorithm is more feasible by performing buffer insertion not in post-process but during early physical design. Our contributions are threefold. First, we introduce an improved fast dummy fill amount estimation algorithm based on [4], and use some speedup techniques (tile merging, fill factor and amount assigning) for early estimation. Second, based on some reasonable assumptions, we present an optimum virtual dummy fill method to estimate dummy position and the effect on the interconnect capacitance. Then the dummy fill estimation model was verified by our experiments. Third, we use this model in early buffer insertion after layer assignment considering the effects of dummy fill. Experimental results verified the necessity of early dummy fill estimation and the validity of our algorithm. Buffer insertion considering dummy fill during early physical design is necessary and our algorithm is promising.
Yuchun MA Xin LI Yu WANG Xianlong HONG
In 3D IC design, thermal issue is a critical challenge. To eliminate hotspots, physical layouts are always adjusted by some incremental changes, such as shifting or duplicating hot blocks. In this paper, we distinguish the thermal-aware incremental changes in three different categories: migrating computation, growing unit and moving hotspot blocks. However, these modifications may degrade the packing area as well as interconnect distribution greatly. In this paper, mixed integer linear programming (MILP) models are devised according to these different incremental changes so that multiple objectives can be optimized simultaneously. Furthermore, to avoid random incremental modification, which may be inefficient and need long runtime to converge, here potential gain is modeled for each candidate incremental change. Based on the potential gain, a novel thermal optimization flow to intelligently choose the best incremental operation is presented. Experimental results show that migrating computation, growing unit and moving hotspot can reduce max on-chip temperature by 7%, 13% and 15% respectively on MCNC/GSRC benchmarks. Still, experimental results also show that the thermal optimization flow can reduce max on-chip temperature by 14% to the initial packings generated by an existing 3D floorplanning tool CBA, and achieve better area and total wirelength improvement than individual operations do. The results with the initial packings from CBA_T (Thermal-aware CBA floorplanner) show that 13.5% temperature reduction can be obtained by our incremental optimization flow.
Jingjing FU Zuying LUO Xianlong HONG Yici CAI Sheldon X.-D. TAN Zhu PAN
In this paper, we present an efficient method to budget on-chip decoupling capacitors (decaps) to optimize power delivery networks in an area efficient way. Our algorithm is based on an efficient gradient-based non-linear programming method for searching the solution. Our contributions are an efficient gradient computation method (time-domain merged adjoint network method) and a novel equivalent circuit modeling technique to speed up the optimization process. Experimental results demonstrate that the algorithm is capable of efficiently optimizing very large scale P/G networks.
Yibo WANG Yici CAI Xianlong HONG Yi ZOU
Buffer insertion plays a great role in modern global interconnect optimization. But too many buffers exhaust routing resources, and result in the rise of the power dissipation. Unfortunately, simplified delay models used by most of the present buffer insertion algorithms may introduce redundant buffers due to the delay estimation errors, whereas accurate delay models expand the solution space significantly, resulting in unacceptable runtime. Moreover, the power dissipation problem becomes a dominant factor in the state-of-the-art IC design. Not only transistor but also interconnect should be taken into consideration in the power calculation, which makes us have to use an accurate power model to calculate the total power dissipation. In this paper, we present two stochastic optimization methods, simulated annealing and solution space smoothing, which use accurate delay and power models to construct buffered routing trees with considerations of buffer/wire sizing, routing obstacles and delay and power optimization. Experimental results show our methods can save much of the buffer area and the power dissipation with better solutions, and for the cases with pins ≤ 15, the runtime of solution space smoothing is tens of times faster.
Bin LIU Yici CAI Qiang ZHOU Xianlong HONG
In VDSM era, crosstalk is becoming a more and more vital factor in high performance VLSI designs, making noise mitigation in early design stages necessary. In this paper, we propose an effective algorithm optimizing crosstalk under congestion constraint in the layer assignment stage. A new model for noise severity measurement is developed where wire length is used as a scale for the noise immunity, and both capacitive and inductive coupling between sensitive nets are considered. We also take shield insertion into account for further crosstalk mitigation. Experimental results show that our approach could efficiently reduce crosstalk noise without compromising congestion compared to the algorithm proposed in [1].
Weikun GUO Sheldon X.-D. TAN Zuying LUO Xianlong HONG
This paper proposes a new simulation algorithm for analyzing large power distribution networks, modeled as linear RLC circuits, based on a novel partial random walk concept. The random walk simulation method has been shown to be an efficient way to solve for voltages of small number of nodes in a large power distribution network, but the algorithm becomes expensive to solve for voltages of nodes that are more than a few with high accuracy. In this paper, we combine direct methods like LU factorization with the random walk concept to solve power distribution networks when voltage waveforms from a large number of nodes are required. We extend the random walk algorithm to deal with general RLC networks and show that Norton companion models for capacitors and self-inductors are more amenable for transient analysis by using random walks than Thevenin companion models. We also show that by nodal analysis (NA) formulation for all the voltage sources, LU-based direct simulations of subcircuits can be speeded up. Experimental results demonstrate that the resulting algorithm, called partial random walk (PRW), has significant advantages over the existing random walk method especially when the VDD/GND nodes are sparse and accuracy requirement is high.
Shan ZENG Wenjian YU Xianlong HONG Chung-Kuan CHENG
In this paper, an efficient method is proposed to accurately analyze large-scale power/ground (P/G) networks, where inductive parasitics are modeled with the partial reluctance. The method is based on frequency-domain circuit analysis and the technique of vector fitting, and obtains the time-domain voltage response at given P/G nodes. The frequency-domain circuit equation including partial reluctances is derived, and then solved with the GMRES algorithm with rescaling, preconditioning and recycling techniques. With the merit of sparsified reluctance matrix and iterative solving techniques for the frequency-domain circuit equations, the proposed method is able to handle large-scale P/G networks with complete inductive modeling. Numerical results show that the proposed method is orders of magnitude faster than HSPICE, several times faster than INDUCTWISE, and capable of handling the inductive P/G structures with more than 100,000 wire segments.
Xiaoyi WANG Jin SHI Yici CAI Xianlong HONG
It's a trend to consider the power supply integrity at early stage to improve the design quality. Specifically, floorplanning process is modified to improve the power supply as well. In the modified floorplanning process, both the floorplan and power/ground (P/G) network are adjusted to search for optimal floorplan as well as the most robust power supply. In this paper, we propose a novel algorithm to carry out this modified floorplanning. A new analytical method is proposed to estimate the voltage drop while the floorplan is varying constantly. This fast analytical voltage drop estimating method is plugged into the modified floorplanner to speed up the whole floorplanning process. Compared with previous methods, our algorithm can search for the optimal floorplan with consideration of power supply integrity more efficiently and therefore leads to better results. Furthermore, this paper also proposes a novel heuristic method to optimize the topology of P/G network. This optimization algorithm could construct a more robust power supply system. Experimental results show the method can speedup the IR-drop aware floorplanning process by about 10 times and reduce the routing area of P/G network while maintaining the floorplan quality and power supply integrity.
Yici CAI Bin LIU Qiang ZHOU Xianlong HONG
The voltage island style has been widely accepted as an effective way to design low power high performance chips. This paper proposes an automated voltage island generation flow in standard cell based designs. Two important objectives in voltage island designs are addressed in this flow: 1) reducing power dissipation under given performance constraints; 2) reducing implementation overheads, mainly layout overheads caused by cell clustering to form islands. The first objective is handled with timing and power driven netweighting and timing analysis in voltage assignment. For the second objective, we propose layout aware voltage assignment, i.e., voltage assignment during placement. We iteratively perform the following to adjustments: adjustment on voltage assignment to facilitate voltage island generation, and adjustment on cell locations to cluster cells in voltage islands. These iterations lead to a flow featured with tightly integrated voltage assignment and cell placement. Experimental results have demonstrated the advantages of our approach.
Yuchun MA Xianlong HONG Sheqin DONG Yici CAI Chung-Kuan CHENG Jun GU
Boundary Constraints of VLSI floorplanning require a set of blocks to be placed along the boundaries of the chip. Thus, this set of blocks can be adjacent to I/O pads for external communication. Furthermore, these blocks are kept away from the central area so that they do not form blockage for internal routing. In the paper, we devise an algorithm of VLSI floorplanning with boundary constraints using a Corner Block List (CBL) representation. We identify the necessary and sufficient conditions of the CBL representation for the boundary constraints. We design a linear time approach to scan the conditions and formulate a penalty function to punish the constraint violation. A simulated annealing process is adopted to optimize the floorplan. Experiments on MCNC benchmarks show promising results.
Yi ZOU Yici CAI Qiang ZHOU Xianlong HONG Sheldon X.-D. TAN
This paper presents a novel approach to reducing the complexity of the transient linear circuit analysis for a hybrid structured clock network. Topology reduction is first used to reduce the complexity of the circuits and a preconditioned Krylov-subspace iterative method is then used to perform the nodal analysis on the reduced circuits. By proper selection of the simulation time step and interval based on Elmore delays, the delay of the clock signal between the clock source and the sink node as well as the clock skews between the sink nodes can be computed efficiently and accurately. Our experimental results show that the proposed algorithm is two orders of magnitude faster than HSPICE without loss of accuracy and stability. The maximum error is within 0.4% of the exact delay time.
Jingyu XU Xianlong HONG Tong JING Yici CAI Jun GU
As the CMOS technology enters the very deep submicron era, inter-wire coupling capacitance becomes the dominant part of load capacitance. The coupling effects have brought new challenges to routing algorithms on both delay estimation and optimization. In this paper, we propose a timing-driven global routing algorithm with consideration of coupling effects. Our two-phase algorithm based on timing-relax method includes a heuristic Steiner tree algorithm to guarantee the timing performance of the initial solution and an optimization algorithm based on coupling-effect-transference. Experimental results are given to demonstrate the efficiency and accuracy of the algorithm.
Yongqiang LU Chin-Ngai SZE Xianlong HONG Qiang ZHOU Yici CAI Liang HUANG Jiang HU
With VLSI design development, the increasingly severe power problem requests to minimize clock routing wirelength so that both power consumption and power supply noise can be alleviated. In contrast to most of traditional works that handle this problem only in clock routing, we propose to navigate standard cell register placement to locations that enable further less clock routing wirelength and power. To minimize adverse impacts to conventional cell placement goals such as signal net wirelength and critical path delay, the register placement is carried out in the context of a quadratic placement. The proposed technique is particularly effective for the recently popular prescribed skew clock routing. Experiments on benchmark circuits show encouraging results.