IEICE TRANSACTIONS on Fundamentals

  • Impact Factor

    0.48

  • Eigenfactor

    0.003

  • Article Influence

    0.1

  • CiteScore

    1.1

Volume E87-A No.12  (Publication Date:2004/12/01)

    Special Section on VLSI Design and CAD Algorithms
  • FOREWORD

    Michiaki MURAOKA  

     
    FOREWORD

      Page(s):
    3029-3029
  • RTOS-Centric Cosimulator for Embedded System Design

    Shinya HONDA  Takayuki WAKABAYASHI  Hiroyuki TOMIYAMA  Hiroaki TAKADA  

     
    PAPER-System Level Design

      Page(s):
    3030-3035

    With the growing design complexity of contemporary embedded systems, real-time operating systems (RTOSs) have become one of the important components of such systems. This paper presents an RTOS-centric hardware/software cosimulator which we have developed for embedded system design. One of the most remarkable features of our cosimulator is that it has a complete simulation model of an RTOS which is widely used in industry, so that application tasks including RTOS service calls are natively executed on a host computer. Our cosimulator also features cosimulation with functional simulation models of hardware written in C/C++ and cosimulation with HDL simulators. A case study with a JPEG decoder application demonstrates the effectiveness of our cosimulator.

  • FPGA-Based Reconfigurable Adaptive FEC

    Kazunori SHIMIZU  Jumpei UCHIDA  Yuichiro MIYAOKA  Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-System Level Design

      Page(s):
    3036-3046

    In this paper, we propose a reconfigurable adaptive FEC system. In adaptive FEC schemes, the error correction capability t is changed dynamically according to the communication channel condition. If a particular error correction capability t is given, we can implement an FEC decoder which is optimal for t by taking the number of operations into consideration. Thus, reconfiguring the optimal FEC decoder dynamically for each error correction capability allows us to maximize the throughput of each decoder within a limited hardware resource. Based on this concept, our reconfigurable adaptive FEC system can reduce the packet dropping rate more efficiently than conventional fixed hardware systems. We can improve data transmission throughput for a reliable transport protocol. Practical simulation results are also shown.

  • An IP Synthesizer for Limited-Resource DWT Processor

    Lan-Rong DUNG  

     
    PAPER-System Level Design

      Page(s):
    3047-3056

    This paper presents a VLSI design methodology for the MAC-level DWT/IDWT processor based on a novel limited-resource scheduling algorithm. The r-split Fully-specified Signal Flow Graph (FSFG) of limited-resource FIR filtering has been developed for the scheduling of the MAC-level DWT/IDWT signal processing. Given a set of architecture constraints and DWT parameters, the scheduling algorithm can generate four scheduling matrices that drive the data path to perform the DWT computation. Because the inter-octave memory is considered together with the registers of the FIR filter, the memory size is smaller than that of the traditional architecture. In addition, based on the limited-resource scheduling algorithm, an automated DWT processor synthesizer has been developed that generates constrained DWT processors in the form of silicon intellectual property (SIP). The DWT SIP can be embedded into a SOC or mapped to program code for commercial off-the-shelf (COTS) DSP processors with programmable devices. As a result, it has been demonstrated that a variety of DWT SIPs can be efficiently realized by tuning the parameters and applied to signal processing applications.

  • SoC Architecture Synthesis Methodology Based on High-Level IPs

    Michiaki MURAOKA  Hiroaki NISHI  Rafael K. MORIZAWA  Hideaki YOKOTA  Yoichi ONISHI  

     
    PAPER-System Level Design

      Page(s):
    3057-3067

    In this paper, we propose a sophisticated synthesis methodology for SoC (System-on-Chip) architectures from the system-level specification, based on reusable high-level IPs named Virtual Cores (VCores). This synthesis methodology generates an initial architecture that consists of a CPU, buses, IPs, peripherals, I/Os and an RTOS (Real Time Operating System), and makes tradeoffs in the architecture between hardware and software on the assigned software VCores and hardware VCores. The results of an architecture-level design experiment using the proposed methodology show that partial automation of the architecture synthesis process, allied with design reuse, accelerates architecture design and therefore reduces the time required to design an SoC architecture.

  • An Embedded Processor Core for Consumer Appliances with 2.8GFLOPS and 36 M Polygons/s FPU

    Fumio ARAKAWA  Motokazu OZAWA  Osamu NISHII  Toshihiro HATTORI  Takeshi YOSHINAGA  Tomoichi HAYASHI  Yoshikazu KIYOSHIGE  Takashi OKADA  Masakazu NISHIBORI  Tomoyuki KODAMA  Tatsuya KAMEI  Makoto ISHIKAWA  

     
    PAPER-System Level Design

      Page(s):
    3068-3074

    A SuperH™ embedded processor core implemented in a 130-nm CMOS process running at 400 MHz achieved 720 MIPS and 2.8 GFLOPS at a power of 250 mW in worst-case conditions. It has a dual-issue seven-stage pipeline architecture but maintains the 1.8 MIPS/MHz of the previous five-stage processor. The processor meets the requirements of a wide range of applications, and is suitable for digital appliances aimed at the consumer market, such as cellular phones, digital still/video cameras, and car navigation systems.

  • High-Level Power Optimization Based on Thread Partitioning

    Jumpei UCHIDA  Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-System Level Design

      Page(s):
    3075-3082

    This paper proposes a thread partitioning algorithm for low-power high-level synthesis. The algorithm is applied to high-level synthesis systems in which parallel-behaving circuit blocks (threads) can be described explicitly. First, it focuses on a local register file RF in a thread. It partitions a thread into two sub-threads, one of which has RF and the other does not. The partitioned sub-threads need to be synchronized with each other to keep the data dependency of the original thread. Since the partitioned sub-threads have waiting time for synchronization, gated clocks can be applied to each sub-thread. We can then synthesize a low-power circuit with a low area overhead compared to the original circuit. Experimental results demonstrate the effectiveness and efficiency of the algorithm.

  • Coupling-Driven Data Bus Encoding for SoC Video Architectures

    Luca FANUCCI  Riccardo LOCATELLI  Andrea MINGHI  

     
    PAPER-System Level Design

      Page(s):
    3083-3090

    This paper presents the definition and implementation design of a low-power data bus encoding scheme dedicated to system-on-chip video architectures. Trends in CMOS technologies focus attention on the energy consumption of on-chip global communication; this is especially true for data-dominated applications such as video processing. Taking scaling effects into account, a novel coupling-aware bus power model is used to investigate the statistical properties of video data collected on the system bus of a reference hardware/software H.263/MPEG-4 video coder architecture. The results of this analysis and the low-complexity requirements drive the definition of a bus encoding scheme called CDSPBI (Coupling Driven Separated Partial Bus Invert), optimized specifically for video data. A VLSI implementation of the coding circuits completes the work with an area/delay/power characterization that shows the effectiveness of the proposed scheme in terms of global power saving for a small circuit area overhead.
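
The abstract does not specify how CDSPBI works, so the sketch below shows only the classic bus-invert idea that the name builds on: when more than half of the bus lines would toggle, the inverted word is sent together with an extra invert flag. It counts self transitions only and ignores coupling between neighbouring wires; the function names, the all-zero initial bus state, and the per-word flag are assumptions made for illustration.

```python
def bus_invert_encode(words, width):
    """Classic bus-invert coding (not CDSPBI): returns (sent_word, invert_flag) pairs."""
    mask = (1 << width) - 1
    prev = 0  # assumption: the bus starts in the all-zero state
    encoded = []
    for w in words:
        w &= mask
        toggles = bin(prev ^ w).count("1")  # lines that would switch if w were sent as-is
        if toggles > width // 2:
            sent, inv = (~w) & mask, 1      # inverting toggles fewer lines
        else:
            sent, inv = w, 0
        encoded.append((sent, inv))
        prev = sent
    return encoded


def bus_invert_decode(encoded, width):
    """Recover the original words from (sent_word, invert_flag) pairs."""
    mask = (1 << width) - 1
    return [(~sent) & mask if inv else sent for sent, inv in encoded]


if __name__ == "__main__":
    data = [0x00, 0xFF, 0x0F]
    enc = bus_invert_encode(data, 8)
    print(enc)
    print(bus_invert_decode(enc, 8) == data)
```

A coupling-aware scheme such as the one the abstract describes would also weigh transitions on adjacent lines, which this sketch does not model.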

  • Power Modeling of Synthesizable Soft Macros

    Kyung Tae DO  Yang Hyo KIM  Young Hwan KIM  Jung Yun CHOI  

     
    PAPER-System Level Design

      Page(s):
    3091-3099

    We present a new approach to the power modeling of synthesizable soft macros, which uses the characteristics of individual input signals for high accuracy. We also present a parameterized power model, developed using the proposed approach, which removes the need for power characterization of all possible macro sizes. Extensive experiments show that the proposed approaches exhibit overall modeling errors below 4.24% and 4.71% for benchmark macros before and after parameterization, respectively, when compared with the results of gate-level analysis.

  • On Multiple-Voltage High-Level Synthesis Using Algorithmic Transformations

    Lan-Rong DUNG  Hsueh-Chih YANG  

     
    PAPER-Logic Synthesis

      Page(s):
    3100-3108

    This paper presents a multiple-voltage high-level synthesis approach for low-power DSP applications using algorithmic transformation techniques. Our approach is motivated by the maximization of task mobilities, since increasing mobilities raises the possibility of assigning tasks to low-voltage components. The mobility of a task is its freedom in scheduling its starting time, defined as the distance between its as-late-as-possible (ALAP) schedule time and its as-soon-as-possible (ASAP) schedule time. To gain task mobilities, we use loop shrinking, retiming and unfolding techniques. Loop shrinking first reduces the iteration period bound (IPB), and then the other techniques are employed to shorten the iteration period (IP) as much as possible. Minimizing the IP results in high task mobilities. Finally, we can assign tasks with high mobilities to low-voltage components and thus minimize energy under resource and latency constraints. Even when the overhead of level conversion is considered, our approach can achieve significant power reduction. In the case of the third-order IIR filter, the proposed approach can save up to 40.2% of power consumption.
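
The mobility definition above (ALAP start time minus ASAP start time) can be illustrated on a small acyclic task graph. This is a minimal sketch, not the paper's synthesis flow: the task names, durations, edges, and the latency bound are hypothetical, and loop shrinking, retiming, and unfolding are not modelled.

```python
from collections import defaultdict


def mobilities(durations, edges, latency):
    """Return {task: ALAP start - ASAP start} for a DAG of tasks.

    durations: {task: execution time}; edges: (producer, consumer) pairs;
    latency: deadline by which every task must have finished.
    """
    succs, preds = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succs[u].append(v)
        preds[v].append(u)

    # Topological order (Kahn's algorithm); the list grows while we walk it.
    indeg = {t: len(preds[t]) for t in durations}
    order = [t for t in durations if indeg[t] == 0]
    for t in order:
        for s in succs[t]:
            indeg[s] -= 1
            if indeg[s] == 0:
                order.append(s)

    # ASAP: start as soon as every predecessor has finished.
    asap = {}
    for t in order:
        asap[t] = max((asap[p] + durations[p] for p in preds[t]), default=0)

    # ALAP: finish no later than the earliest ALAP start of any successor.
    alap = {}
    for t in reversed(order):
        alap[t] = min((alap[s] for s in succs[t]), default=latency) - durations[t]

    return {t: alap[t] - asap[t] for t in durations}


if __name__ == "__main__":
    durations = {"a": 1, "b": 1, "c": 2, "d": 1, "e": 1}
    edges = [("a", "c"), ("b", "c"), ("c", "d"), ("e", "d")]
    print(mobilities(durations, edges, latency=4))
```

In this example only task `e` is off the critical path, so it is the only task with non-zero mobility and would be the natural candidate for a slower, low-voltage component.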

  • A Low-Power Architecture for Extended Finite State Machines Using Input Gating

    Shi-Yu HUANG  Chien-Jyh LIU  

     
    PAPER-Logic Synthesis

      Page(s):
    3109-3115

    In this paper, we investigate a low-power architecture for designs modeled as an Extended Finite State Machine (EFSM). It is based on the general dynamic power management concept, in which redundant computation can be dynamically disabled to reduce the overall power dissipation. The contribution of this paper is mainly a systematic procedure to identify an almost maximal amount of redundant computation in a design given as an EFSM. There are two levels of redundant computation to be exploited--one is based on the machine state information, while the other is based on the transition information. After the extraction of the redundant computation, a low-power architecture using input gating is proposed to synthesize the final circuit. We tested the technique on a design computing a number's modular inverse. Experimental results show that a 31% power reduction can be achieved at the cost of a 2% timing penalty and a 16% area overhead.

  • Dynamic Sleep Control for Finite-State-Machines to Reduce Active Leakage Power

    Kimiyoshi USAMI  Hiroshi YOSHIOKA  

     
    PAPER-Logic Synthesis

      Page(s):
    3116-3123

    Leakage power is predicted to become dominant in the total operating power as transistor technology advances. Even in current technology, the dramatic increase of leakage power at elevated temperature is a big problem. Burn-in testing, which is typically performed at 125°C, faces difficulties such as throughput degradation or thermal runaway due to the increase of leakage power. Reducing leakage power at operation time is essential to solve these problems. We propose a novel approach that makes use of the enable signal of a gated-clock technique to reduce active leakage power. A sleep transistor is provided between the combinational logic circuits and the ground, and is controlled by the enable signal. When state transitions do not occur in Finite-State-Machines (FSMs), the enable signal becomes low and the state flip-flops keep the data. At the same time, the sleep transistor is turned off so that the combinational logic gates are electrically disconnected from the ground to reduce leakage. Simulation results have shown that the proposed scheme reduces active leakage power by 30-60% in 0.18 µm technology. The total power was reduced by up to 20% at 125°C. It was also found that the performance degradation was tolerable for burn-in testing.

  • Super-Set of Permissible Functions and Its Application to the Transduction Method

    Katsunori TANAKA  Yahiko KAMBAYASHI  

     
    PAPER-Logic Synthesis

      Page(s):
    3124-3133

    The Transduction Method is a powerful way to design logic circuits, utilizing already existing circuits. A set of permissible functions (SPF) plays an essential role in such circuit transformation/reduction, and is computed at each point (connection or gate output). Currently, two types of SPFs have been used: the maximum SPFs (MSPFs) and compatible SPFs (CSPFs). At each point, the MSPF is literally the set of all PF's, and CSPF is a subset of the MSPF. When CSPFs are calculated, priorities are first assigned to all gates in the circuit. Based on the priorities, it is decided which subset is to be selected as the CSPF. The quality of the results depends on the priorities. In this paper, the concept of super-sets of permissible functions (SSPFs) is introduced to reduce the effect of the priorities that CSPFs depend on. In order to loosen the dependency, each SSPF is computed to contain CSPFs which are candidates to be selected. The experimental results show that the SSPF-based Transduction Method has intermediate reduction capability and takes an intermediate computation time between the MSPF-based and CSPF-based ones. The capability and the time are considered as an acceptably good trade-off. In addition, without any transformations, since SSPFs are the maximum super-set, SSPFs are applicable for analyzing the maximum performance of the CSPF-based transformation, for comparison with the MSPF-based one. Theoretically, the number of connectable gate pairs detected by the MSPFs is 100%. According to the experimental results obtained using SSPFs, on average, 99% are detectable by SSPFs and 1% are detectable only by using the MSPFs. The results show that by using CSPFs, 72% of connectable gate pairs are detectable with any priority assignment and 99% (SSPFs capability) are detectable on average even when the best priorities are assigned. According to the experimental results of CSPF calculation with five priorities, 82% to 93% are practically detectable on average. 
    This is the first quantitative analysis realized by SSPFs which compares the CSPF-based and MSPF-based Transduction Methods with respect to the coverage of PF's.

  • Fast Boolean Matching under Permutation by Efficient Computation of Canonical Form

    Debatosh DEBNATH  Tsutomu SASAO  

     
    PAPER-Logic Synthesis

      Page(s):
    3134-3140

    Checking the equivalence of two Boolean functions under permutation of the variables is an important problem in the synthesis of multiplexer-based field-programmable gate arrays (FPGAs), and the problem is known as Boolean matching. This paper presents an efficient breadth-first search technique for computing a canonical form--namely P-representative--of Boolean functions under permutation of the variables. Two functions match if they have the same P-representative. On an ordinary workstation, on the average, the method requires several microseconds to check the Boolean matching of functions with up to eight variables against a library with tens of thousands of cells.
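
The matching rule above (two functions match if they have the same P-representative) can be illustrated with a brute-force canonical form. This is not the breadth-first technique of the paper: it simply tries all n! variable permutations, and taking the numerically smallest truth table as the representative is an assumption made for this sketch. Truth tables are stored as integers whose bit `x` holds the function value for input assignment `x`.

```python
from itertools import permutations


def permute_truth_table(tt, n, perm):
    """Truth table of g(x) = f(y), where input bit i of x is moved to position perm[i] of y."""
    out = 0
    for x in range(1 << n):
        y = 0
        for i in range(n):
            if (x >> i) & 1:
                y |= 1 << perm[i]
        if (tt >> y) & 1:
            out |= 1 << x
    return out


def p_representative(tt, n):
    """Smallest truth table reachable by permuting the n variables (naive O(n! * 2^n))."""
    return min(permute_truth_table(tt, n, p) for p in permutations(range(n)))


def p_match(f, g, n):
    """True if f and g are equivalent under some permutation of the variables."""
    return p_representative(f, n) == p_representative(g, n)


if __name__ == "__main__":
    f = 0b0010  # x0 AND NOT x1
    g = 0b0100  # x1 AND NOT x0
    h = 0b1000  # x0 AND x1
    print(p_match(f, g, 2), p_match(f, h, 2))
```

The exhaustive search is only practical for a handful of variables, which is why an efficient way of computing the canonical form matters for library matching.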

  • A Realization of Multiple-Output Functions by a Look-Up Table Ring

    Hui QIN  Tsutomu SASAO  Munehiro MATSUURA  Shinobu NAGAYAMA  Kazuyuki NAKAMURA  Yukihiro IGUCHI  

     
    PAPER-Logic Synthesis

      Page(s):
    3141-3150

    A look-up table (LUT) cascade is a new type of programmable logic device (PLD) that provides an alternative way to realize multiple-output functions. An LUT ring is an emulator for an LUT cascade. Compared with an LUT cascade, the LUT ring is more flexible. In this paper we discuss the realization of multiple-output functions with the LUT ring. Unlike an FPGA realization of a logic function, accurate prediction of the delay time is easy in an LUT ring realization. A prototype of an LUT ring has been custom-designed with 0.35 µm CMOS technology. Simulation results show that the LUT ring is 80 to 241 times faster than software programs on an SH-1, and 36 to 93 times faster than software programs on a Pentium III when the frequencies for the LUT ring and the MPUs are the same, but is slightly slower than commercial FPGAs.

  • Timing Optimization Methodology Based on Replacing Flip-Flops by Latches

    Ko YOSHIKAWA  Keisuke KANAMARU  Yasuhiko HAGIHARA  Shigeto INUI  Yuichi NAKAMURA  Takeshi YOSHIMURA  

     
    PAPER-Logic Synthesis

      Page(s):
    3151-3158

    Latch-based circuits have advantages for timing and are widely used for high-speed custom circuits. ASIC design flows, however, are based on circuits with flip-flops. This paper describes a new timing optimization algorithm that replaces the flip-flops in high-end ASICs with latches without changing the functionality of the circuits. Timing is optimized by using fixed-phase retiming, which minimizes the impact of clock skew and jitter. A formal equivalence verification method that assures the logical correctness of the latch-replaced circuits is also proposed. Experimental results show that the optimization algorithm decreases the delay of benchmark circuits by as much as 17%.

  • Efficient False Aggressors Pruning with Functional Correlation

    Hyungwoo LEE  Juho KIM  

     
    PAPER-Logic Synthesis

      Page(s):
    3159-3165

    Signal integrity has become one of the main issues in digital circuits manufactured with today's deep submicron technology. The coupling capacitance of neighboring lines may cause circuit delays and may affect the functionality of the circuit. These effects are usually referred to as crosstalk. Since fixing crosstalk noise requires additional design cost, the false aggressor nodes that cannot affect the victim node have to be eliminated. In this paper, we propose an efficient heuristic algorithm that considers functional correlation for false aggressor pruning in crosstalk noise analysis. The false aggressors are detected by a path sensitization algorithm and logic implication. The efficiency of our algorithm has been verified on benchmark circuits with a 0.18 µm standard cell library. Experimental results show an average of 5.4% false aggressor detection and an average improvement of 14.6% in the accuracy of timing analysis.

  • A Parallel Flop Synchronizer and the Handshake Interface for Bridging Asynchronous Domains

    Suk-Jin KIM  Jeong-Gun LEE  Kiseon KIM  

     
    PAPER-Logic Synthesis

      Page(s):
    3166-3173

    Inter-domain communications on a chip require a synchronizer to resolve the timing problems between an input and the clock of a destination. This paper presents a parallel flop synchronizer and its interface circuit for transferring asynchronous data to the clock domain. The proposed scheme uses a bank of independent two-flops in parallel and supports a two-phase handshake protocol. Compared to the conventional two-flop synchronizer, performance analysis shows that the proposed scheme can reduce latency up to one and a half clock cycles while keeping its safety at a tolerable level. All designs have been implemented in a 0.25 µm CMOS technology to verify the performance analysis of the proposed synchronization.

  • Test Architecture Optimization for System-on-a-Chip under Floorplanning Constraints

    Makoto SUGIHARA  Kazuaki MURAKAMI  Yusuke MATSUNAGA  

     
    PAPER-Test

      Page(s):
    3174-3184

    In this paper, a test architecture optimization for system-on-a-chip under floorplanning constraints is proposed. The models of previous test architecture optimizations were too idealized to be applied to industrial SOCs. Moreover, they could not handle the topological locality of cores, that is, floorplanning constraints. The optimization proposed in this paper avoids long wires for TAMs by taking floorplanning constraints into consideration, and finishes optimizing test architectures within reasonable computation time.

  • Efficient Block-Level Connectivity Verification Algorithms for Embedded Memories

    Jin-Fu LI  

     
    PAPER-Test

      Page(s):
    3185-3192

    A large memory is typically designed with multiple identical memory blocks for reducing delay and power. The circuit verification of individual memory blocks can be effectively handled by the Symbolic Trajectory Evaluation (STE) approach. However, if multiple memory blocks are integrated into a single system, the STE approach cannot verify it economically. This paper introduces algorithms for verifying block-level connectivity of memories. The verification time of a large memory can be reduced drastically by using a bottom-up verification scheme. That is, a memory block is first verified thoroughly, and then only the interconnection between memory blocks of the large memory needs to be verified. The proposed verification algorithms require (3n + 2(log_2 n + 1) + 3 log_2 m) Read/Write operations for a 2^n × m-bit memory, where n and m are the address width and data width, respectively. Also, the algorithms can verify 100% of the inter-port and intra-port signal misplaced faults of the address, data input, and data output ports.
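
To get a feel for the operation count, the sketch below evaluates the expression from the abstract, read here as 3n + 2(log2 n + 1) + 3 log2 m for a memory of 2^n words of m bits. That reading of the flattened formula is an assumption, and so is the restriction to power-of-two n and m, since the abstract does not say how non-integer logarithms are rounded. The function names are hypothetical.

```python
def log2_exact(x):
    """Integer base-2 logarithm; only powers of two are accepted in this sketch."""
    if x < 1 or x & (x - 1):
        raise ValueError("expected a power of two, got %r" % (x,))
    return x.bit_length() - 1


def rw_operations(n, m):
    """Read/Write operations for address width n and data width m,
    using 3n + 2(log2 n + 1) + 3 log2 m as read from the abstract."""
    return 3 * n + 2 * (log2_exact(n) + 1) + 3 * log2_exact(m)


if __name__ == "__main__":
    n, m = 16, 32
    print(rw_operations(n, m), "operations for", (1 << n) * m, "bits")
```

The count grows with the port widths rather than with the number of stored bits, which is what makes checking only the block-level interconnection cheap compared with re-verifying every cell.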

  • A Hybrid Dictionary Test Data Compression for Multiscan-Based Designs

    Youhua SHI  Shinji KIMURA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-Test

      Page(s):
    3193-3199

    In this paper, we present a test data compression technique to reduce test data volume for multiscan-based designs. In our method, the internal scan chains are divided into equal-sized groups and two dictionaries are built to encode either an entire slice or a subset of the slice. Depending on the codeword, the decompressor may load all scan chains or only a group of the scan chains, which can enhance the effectiveness of dictionary-based compression. In contrast to previous dictionary coding techniques, even for a CUT with a large number of scan chains, the proposed approach can achieve a satisfactory reduction in test data volume with a reasonably small dictionary. Experimental results show that the proposed test scheme works particularly well for the large ISCAS'89 benchmarks.

  • A Design Scheme for Delay Testing of Controllers Using State Transition Information

    Tsuyoshi IWAGAKI  Satoshi OHTAKE  Hideo FUJIWARA  

     
    PAPER-Test

      Page(s):
    3200-3207

    This paper presents a non-scan design scheme to enhance delay fault testability of controllers. In this scheme, we utilize a given state transition graph (STG) to test delay faults in its synthesized controller. The original behavior of the STG is used during test application. For faults that cannot be detected by using the original behavior, we design an extra logic, called an invalid test state and transition generator, to make those faults detectable. Our scheme allows achieving short test application time and at-speed testing. We show the effectiveness of our method by experiments.

  • A Selective Scan Chain Reconfiguration through Run-Length Coding for Test Data Compression and Scan Power Reduction

    Youhua SHI  Shinji KIMURA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-Test

      Page(s):
    3208-3215

    Test data volume and power consumption for scan-based designs are two major concerns in system-on-a-chip testing. However, test set compaction by filling the don't-cares will invariably increase the scan-in power dissipation for scan testing, so the goals of test data reduction and low-power scan testing appear to conflict. Therefore, in this paper we present a selective scan chain reconfiguration method for test data compression and scan-in power reduction. The proposed method analyzes the compatibility of the internal scan cells for a given test set and then divides the scan cells into compatible classes. After the scan chain reconfiguration, a dictionary is built to indicate the run-length of each compatible class, and only the scan-in data for each class needs to be transferred from the ATE to the CUT, which reduces test data volume. Experimental results for the larger ISCAS'89 benchmarks show that the proposed approach overcomes the limitations of traditional run-length coding techniques, and leads to highly reduced test data volume with significant power savings during scan testing in all cases.

  • Synthesis for Testability of Synchronous Sequential Circuits with Strong-Connectivity Using Undefined States on State Transition Graph

    Soo-Hyun KIM  Ho-Yong CHOI  Kiseon KIM  Dong-Ik LEE  

     
    PAPER-Test

      Page(s):
    3216-3223

    In this paper, the use of undefined states on a State Transition Graph (STG) is addressed to obtain high fault coverage in the area of Synthesis For Testability (SFT) of synchronous sequential circuits. Basically, a given STG can be modified by adding undefined states and distinguishable transitions so that, as far as possible, every state is included in one strongly-connected component. Such modification decreases the number of redundant faults caused by the existence of unreachable states on an STG. For the modification, we propose two algorithms, for incompletely-specified STGs and completely-specified STGs, respectively. In the case of incompletely-specified STGs, undefined states are added using unspecified transitions of defined states. In the case of completely-specified STGs, undefined states are added by changing transitions specified on an STG while preserving state equivalence. Experimental results with MCNC benchmarks show that the number of redundant faults of gate-level circuits synthesized from our modified STGs is reduced, resulting in high fault coverage as well as short test generation time.

  • Abstraction and Optimization of Consistent Floorplanning with Pillar Block Constraints

    Ning FU  Shigetoshi NAKATAKE  Yasuhiro TAKASHIMA  Yoji KAJITANI  

     
    PAPER-Floorplan

      Page(s):
    3224-3232

    The success of top-down design of recent huge system LSIs lies in a seamless transfer of the information resulting from the high-level design to the lower level of floorplanning. For this purpose, we introduce a new concept, the abstract floorplan, which is included in the output of high-level design. From the abstract floorplan, the pillar blocks are derived, which are critical sets of blocks that are expected to determine the width and height of the chip, named the frame. Since the frame and pillar blocks are obtained in the high-level stage, they are useful for keeping consistency in the low-level physical design if we apply optimization regarding them as constraints. Experiments on MCNC benchmarks showed that abstract floorplanning by pillar blocks outputs a placement faithful to the physically optimized block placement with respect to the chip area and the wire-length.

  • EQ-Sequences for Coding Floorplans

    Hua-An ZHAO  Chen LIU  Yoji KAJITANI  Keishi SAKANUSHI  

     
    PAPER-Floorplan

      Page(s):
    3233-3243

    A floorplan specifies the layout of modules in very large scale integration (VLSI) design, and a new code, called the EQ-sequence, for representing a floorplan is presented in this paper. The EQ-sequence is based on a Q-sequence. The EQ-sequence can preserve the adjacent relationships of rooms on a floorplan, but the Q-sequence cannot. The algorithms for encoding, moving and decoding of an EQ-sequence are introduced. With the EQ-sequence, we can check whether two modules abut each other on a floorplan. It has been proved that any floorplan of n rooms is uniquely encoded by an EQ-sequence and any EQ-sequence is uniquely decoded to a floorplan, both in O(n) time.

  • A Novel Layout Approach Using Dual Supply Voltage Technique on Body-Tied PD-SOI

    Kazuki FUKUOKA  Masaaki IIJIMA  Kenji HAMADA  Masahiro NUMA  Akira TADA  

     
    PAPER-Floorplan

      Page(s):
    3244-3250

    This paper presents a novel layout approach using a dual supply voltage technique. In the Placing and Routing (P&R) phase, conventional approaches for dual supply voltages need to separate low supply voltage cells from high voltage ones. Consequently, the layout tends to be complex compared with a single supply voltage layout. Our layout approach uses cells having two supply voltage rails. Making these cells is difficult in bulk due to the increase in area caused by n-well isolation or in delay caused by the negative body bias that results from sharing an n-well. On the other hand, making cells with two supply voltage rails is easy in body-tied PD-SOI owing to the trench isolation of each transistor body. Since our approach for dual supply voltages offers as much freedom for placement as conventional ones for a single supply voltage, existing P&R tools can be used without special operation. Simulation results with MCNC circuits and adders show that our approach reduces power by 23% and 25%, respectively, with almost the same delay as a single supply voltage layout.

  • Crosstalk Noise Optimization by Post-Layout Transistor Sizing

    Masanori HASHIMOTO  Hidetoshi ONODERA  

     
    PAPER-Physical Design

      Page(s):
    3251-3257

    This paper proposes a post-layout transistor sizing method for crosstalk noise reduction. The proposed method downsizes the drivers of aggressor wires for noise reduction, utilizing the precise interconnect information extracted from the detail-routed layouts. We develop a transistor sizing algorithm for crosstalk noise reduction under delay constraints, and construct a crosstalk noise optimization method utilizing a previously developed analytic crosstalk noise model and transistor sizing framework. Our method exploits the transistor sizing framework, which can vary transistor widths inside cells while leaving interconnects unchanged. Our optimization method therefore never causes a new crosstalk noise problem, and does not need iterative layout optimization. The effectiveness of the proposed method is experimentally examined using two circuits. The maximum noise voltage is reduced by more than 50% without delay violation. These results show that the risk of crosstalk noise problems can be considerably reduced after detail-routing.

  • A Fast Algorithm for Crosspoint Assignment under Crosstalk Constraints with Shielding Effects

    Keiji KIDA  Xiaoke ZHU  Changwen ZHUANG  Yasuhiro TAKASHIMA  Shigetoshi NAKATAKE  

     
    PAPER-Physical Design

      Page(s):
    3258-3264

    This paper presents a novel algorithm for crosspoint assignment (CPA) that takes into consideration crosstalk noise and shielding effects in deep sub-micron design. We introduce a conditional constraint which is imposed on a sensitive net-pair to detach one net from the other or to put another insensitive net between them for shielding. We provide two algorithms which can handle the conditional constraint. One is based on an ILP, which outputs an exact optimum solution. The other is a fast heuristic whose time complexity is O(n^2 log n), where n is the number of pins. In experiments, we tested these algorithms on industrial examples. The results showed that the conditional constraint for shielding released the algorithms from a tight space of feasible assignments. Our heuristic ran quickly and attained near-optimum solutions.

  • Partial Random Walks for Transient Analysis of Large Power Distribution Networks

    Weikun GUO  Sheldon X.-D. TAN  Zuying LUO  Xianlong HONG  

     
    PAPER-Physical Design

      Page(s):
    3265-3272

    This paper proposes a new simulation algorithm for analyzing large power distribution networks, modeled as linear RLC circuits, based on a novel partial random walk concept. The random walk simulation method has been shown to be an efficient way to solve for the voltages of a small number of nodes in a large power distribution network, but the algorithm becomes expensive when the voltages of more than a few nodes must be solved with high accuracy. In this paper, we combine direct methods like LU factorization with the random walk concept to solve power distribution networks when voltage waveforms from a large number of nodes are required. We extend the random walk algorithm to deal with general RLC networks and show that Norton companion models for capacitors and self-inductors are more amenable to transient analysis by random walks than Thevenin companion models. We also show that using a nodal analysis (NA) formulation for all the voltage sources speeds up LU-based direct simulations of subcircuits. Experimental results demonstrate that the resulting algorithm, called partial random walk (PRW), has significant advantages over the existing random walk method, especially when the VDD/GND nodes are sparse and the accuracy requirement is high.
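
    As a rough illustration of the basic random walk idea mentioned in this abstract (not the authors' partial random walk algorithm), the sketch below estimates one node voltage in a purely resistive DC network. It assumes no load currents, so the voltage at a node equals the conductance-weighted average of its neighbors' voltages; the function name, the 5-node chain and all values are hypothetical.

```python
import random

def random_walk_voltage(start, neighbors, fixed, n_walks=20000, seed=0):
    """Estimate the DC voltage at `start` in a resistive network.

    neighbors: node -> (list of adjacent nodes, list of branch conductances)
    fixed:     node -> known voltage (e.g. supply or ground pads)
    Assumes no current loads, so each walk simply ends at a fixed-voltage
    node and contributes that node's voltage to the average.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_walks):
        node = start
        while node not in fixed:
            nbrs, conds = neighbors[node]
            # Step to a neighbor with probability proportional to conductance.
            node = rng.choices(nbrs, weights=conds)[0]
        total += fixed[node]
    return total / n_walks

# Hypothetical resistor chain 0-1-2-3-4 with unit conductances;
# node 0 is held at 1.0 V and node 4 at 0.0 V.
neighbors = {
    1: ([0, 2], [1.0, 1.0]),
    2: ([1, 3], [1.0, 1.0]),
    3: ([2, 4], [1.0, 1.0]),
}
fixed = {0: 1.0, 4: 0.0}
estimate = random_walk_voltage(2, neighbors, fixed)
print(estimate)  # exact answer for the middle node is 0.5
```

    Because each node needs its own batch of walks, the cost grows with the number of nodes of interest, which is the limitation the abstract says motivates combining random walks with direct methods.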

  • A Fast Decoupling Capacitor Budgeting Algorithm for Robust On-Chip Power Delivery

    Jingjing FU  Zuying LUO  Xianlong HONG  Yici CAI  Sheldon X.-D. TAN  Zhu PAN  

     
    PAPER-Physical Design

      Page(s):
    3273-3280

    In this paper, we present an efficient method to budget on-chip decoupling capacitors (decaps) to optimize power delivery networks in an area efficient way. Our algorithm is based on an efficient gradient-based non-linear programming method for searching the solution. Our contributions are an efficient gradient computation method (time-domain merged adjoint network method) and a novel equivalent circuit modeling technique to speed up the optimization process. Experimental results demonstrate that the algorithm is capable of efficiently optimizing very large scale P/G networks.

  • Applications of Tree/Link Partitioning for Moment Computations of General Lumped R(L)C Interconnect Networks with Multiple Resistor Loops

    Herng-Jer LEE  Ming-Hong LAI  Chia-Chi CHU  Wu-Shiung FENG  

     
    PAPER-Physical Design

      Page(s):
    3281-3292

    A new moment computation technique for general lumped R(L)C interconnect circuits with multiple resistor loops is proposed. Using the concept of tearing, a lumped R(L)C network can be partitioned into a spanning tree and several resistor links. The contributions of network moments from each tree and the corresponding links can be determined independently. By combining the conventional moment computation algorithms and the reduced ordered binary decision diagram (ROBDD), the proposed method can compute system moments efficiently. Experimental results demonstrate that the proposed method can indeed obtain accurate moments and is more efficient than the conventional approach.

  • High Speed Layout Synthesis for Minimum-Width CMOS Logic Cells via Boolean Satisfiability

    Tetsuya IIZUKA  Makoto IKEDA  Kunihiro ASADA  

     
    PAPER-Physical Design

      Page(s):
    3293-3300

    This paper proposes a cell layout synthesis method via Boolean Satisfiability (SAT). Cell layout synthesis problems are first transformed into SAT problems by our formulations. Our method realizes high-speed layout synthesis for CMOS logic cells and is guaranteed to generate minimum-width cells with routability under our layout styles. It considers complementary P-/N-MOSFETs individually during transistor placement, and can generate layouts of smaller width than placement that pairs the complementary P-/N-MOSFETs. To demonstrate the effectiveness of our SAT-based cell synthesis, we present experimental results which compare it with the 0-1 ILP-based transistor placement method and a commercial cell generation tool. The experimental results show that our SAT-based method can generate minimum-width placements in much shorter run time than the 0-1 ILP-based transistor placement method, and can generate the cell layouts of 32 static dual CMOS logic circuits in 54% of the run time of the commercial tool. The area increase of our method without compaction is only 3% compared with the commercial tool with compaction.

  • A Device-Level Placement with Schema Based Clusters in Analog IC Layouts

    Takashi NOJIMA  Xiaoke ZHU  Yasuhiro TAKASHIMA  Shigetoshi NAKATAKE  Yoji KAJITANI  

     
    PAPER-Analog Layout

      Page(s):
    3301-3308

    A challenge to an automated layout of analog ICs starts with the insight into high quality placements crafted by experts. We observe first that matched devices or elemental functions such as input, output, amplifiers, etc. are clustered. Second, devices in the same cluster are located faithfully to the drawn schema. Third, these two features are simultaneously fulfilled in a well-compacted placement. This paper proposes a novel device-level placement that simulates the above features based on Sequence-Pair. A slight modification of the meaning of a relation, say, from "A is left-of B" to "A is not right-of B", enlarges the freedom and allows a neater compaction of clusters with zigzag border curves. As a consequence, clusters are placed faithfully to their relative positions in the schema. We tested our algorithm on industrial instances and compared the results with those of manual design. The results were better than those of manual designs by, on average, 13.5% and 21.2% with respect to the area and total net-length, respectively.

  • Automatic Extraction of Layout-Dependent Substrate Effects for RF MOSFET Modeling

    Zhao LI  Ravikanth SURAVARAPU  Kartikeya MAYARAM  C.-J. Richard SHI  

     
    PAPER-Device Modeling

      Page(s):
    3309-3317

    This paper presents CrtSmile, a CAD tool for the automatic extraction of layout-dependent substrate effects for RF MOSFET modeling. CrtSmile incorporates a new scalable substrate model, which depends not only on the geometric layout information of a transistor (the number of gate fingers, finger width, channel length and bulk contact location), but also on the transistor layout and bulk patterns. We show that this model is simple to extract and has good agreement with measured data for a 0.35 µm CMOS process. CrtSmile reads in the layout information of RF transistors in the CIF/GDSII format and performs a pattern-based layout extraction to recognize the transistor layout and bulk patterns. A scalable layout-dependent substrate model is automatically generated and attached to the standard BSIM3 device model as a sub-circuit for use in circuit simulation. A low noise amplifier is evaluated with the proposed CrtSmile tool, showing the importance of layout effects for RF transistor substrate modeling.

  • Application of High Quality Built-in Test Using Neighborhood Pattern Generator to Industrial Designs

    Kazumi HATAYAMA  Michinobu NAKAO  Yoshikazu KIYOSHIGE  Koichiro NATSUME  Yasuo SATO  Takaharu NAGUMO  

     
    LETTER-Test

      Page(s):
    3318-3323

    This letter presents a practical approach for high-quality built-in test using a test pattern generator called the neighborhood pattern generator (NPG). NPG is practical mainly because its structure is independent of the circuit under test and it can realize high fault coverage not only for stuck-at faults but also for transition faults. Some techniques are also proposed for further improvement in the practical applicability of NPG. Experimental results for large industrial circuits illustrate the efficiency of the proposed approach.

  • A Novel Digitally-Controlled Varactor for Portable Delay Cell Design

    Pao-Lung CHEN  Ching-Che CHUNG  Chen-Yi LEE  

     
    LETTER-Physical Design

      Page(s):
    3324-3326

    In this paper, a novel digitally-controlled varactor (DCV) for portable delay cell design is presented. The proposed varactor uses the gate capacitance differences of NAND/NOR gates under different digital control inputs to build up a digitally-controlled varactor. The proposed varactor is then applied to design a high resolution delay cell and to achieve a fine delay resolution. Different types of NAND/NOR gates (2-input or 3-input) for DCV design are also investigated in this paper. The proposed DCV can be implemented with standard cells, so it can be easily ported to different processes in a short time. A test chip fabricated on a standard 0.35 µm CMOS 2P4M process proves that the proposed delay cell has a fine delay resolution of about 1.55 ps. As a result, the proposed DCV exhibits finer resolution, better linearity, and better portability than traditional delay elements, and is very suitable for portable delay cell design.

    Regular Section
  • Design of High-Order Noise-Shaping FIR Filters for Overload-Free Stable Single- and Multi-Bit Data Converters

    Mitsuhiko YAGYU  Akinori NISHIHARA  

     
    PAPER-Digital Signal Processing

      Page(s):
    3327-3333

    This paper presents optimum and sub-optimal designs of noise-shaping FIR filters for single- and multi-bit data converters. In the designs, only three parameters, the number of taps, the oversampling ratio (OSR) and the l1-norm of the filter coefficients, are specified, and the in-band peak of the amplitude response is minimized under the specifications. The minimization problem is formulated with the overload-free condition, which guarantees rigorous stability, and an overload-free converter generates no distortion in any output signals. In the optimum design, the minimization problem is directly and exactly solved, whereas the sub-optimal method solves this problem by iteratively utilizing the simplex method. The iterative sub-optimal method, without exact optimality, is far faster and more efficient than the optimum method. In design examples, optimum and sub-optimal noise-shaping FIR filters for single- and multi-bit data converters are designed, and their optimal performance is revealed. For single-bit data converters with OSR 64, a noise-shaping FIR filter is designed and then shown to achieve a signal to noise and distortion ratio (SNDR) of 107.6 dB in the band of interest.

  • Progressive Coding of Binary Voxel Models Based on Pattern Code Representation

    Bong Gyun ROH  Chang-Su KIM  Sang-Uk LEE  

     
    PAPER-Digital Signal Processing

      Page(s):
    3334-3342

    In this paper, we propose a progressive encoding algorithm for binary voxel models, which represent 3D object shapes. For progressive transmission, multi-resolution models are generated by decimating an input voxel model. Then, each resolution model is encoded by employing the pattern code representation (PCR). In PCR, the voxel model is represented with a series of pattern codes. The pattern of a voxel describes the local shape of the model around that voxel. PCR can achieve a coding gain, since the pattern codes are highly correlated. In the multi-resolution framework, the coding gain can be further improved by exploiting the decimation constraints from the lower resolution models. Furthermore, the shell classification scheme is proposed to reduce the number of pattern codes needed to represent the whole voxel model. Simulation results show that the proposed algorithm provides about 1.1-1.3 times higher coding gain than the conventional PCR algorithm.

  • Blind Source Separation Based on Phase and Frequency Redundancy of Cyclostationary Signals

    Yong XIANG  Wensheng YU  Jingxin ZHANG  Senjian AN  

     
    PAPER-Digital Signal Processing

      Page(s):
    3343-3349

    This paper presents a new method for blind source separation by exploiting phase and frequency redundancy of cyclostationary signals in a complementary way. It requires a weaker separation condition than those methods which only exploit the phase diversity or the frequency diversity of the source signals. The separation criterion is to diagonalize a polynomial matrix whose coefficient matrices consist of the correlation and cyclic correlation matrices, at time delay τ = 0, of multiple measurements. An algorithm is proposed to perform the blind source separation. Computer simulation results illustrate the performance of the new algorithm in comparison with the existing ones.

  • Fixed-Point, Fixed-Interval and Fixed-Lag Smoothing Algorithms from Uncertain Observations Based on Covariances

    Seiichi NAKAMORI  Raquel CABALLERO-AGUILA  Aurora HERMOSO-CARAZO  Josefa LINARES-PEREZ  

     
    PAPER-Digital Signal Processing

      Page(s):
    3350-3359

    This paper treats the least-squares linear filtering and smoothing problems of discrete-time signals from uncertain observations when the random interruptions in the observation process are modelled by a sequence of independent Bernoulli random variables. Using an innovation approach we obtain the filtering algorithm and a general expression for the smoother which leads to fixed-point, fixed-interval and fixed-lag smoothing recursive algorithms. The proposed algorithms do not require the knowledge of the state-space model generating the signal, but only the covariance information of the signal and the observation noise, as well as the probability that the signal exists in the observed values.

  • A Practical Subspace Blind Identification Algorithm with Reduced Computational Complexity

    Nari TANABE  Toshihiro FURUKAWA  Kohichi SAKANIWA  Shigeo TSUJII  

     
    PAPER-Digital Signal Processing

      Page(s):
    3360-3371

    We propose a practical blind channel identification algorithm based on principal component analysis. The algorithm estimates (1) the channel order and (2) the noise variance, and then identifies (3) the channel impulse response, from the autocorrelation of the channel output signal without using eigenvalue or singular-value decomposition. The special features of the proposed algorithm are (1) a practical method to find the channel order and (2) a reduction of computational complexity. Numerical examples show the effectiveness of the proposed algorithm.

  • T-S Fuzzy Model-Based Synchronization of Time-Delay Chaotic System with Input Saturation

    Jae-Hun KIM  Hyunseok SHIN  Euntai KIM  Mignon PARK  

     
    PAPER-Systems and Control

      Page(s):
    3372-3380

    This paper presents a fuzzy model-based approach for synchronization of a time-delay chaotic system with input saturation. The time-delay chaotic drive and response systems are each represented by a Takagi-Sugeno (T-S) fuzzy model. Specifically, the response system contains input saturation. Using unidirectional linear error feedback and the parallel distributed compensation (PDC) scheme, we design a fuzzy chaotic synchronization system and analyze local stability for the synchronization error dynamics. Since time-delay in the transmission channel always exists, we also take it into consideration. The sufficient condition for the local stability of the fuzzy synchronization system with input saturation and channel time-delay is derived by applying Lyapunov-Krasovskii theory and solving a linear matrix inequality (LMI) problem. Numerical examples are given to demonstrate the validity of the proposed approach.

  • A Statistical Analysis of Non-linear Equations Based on a Linear Combination of Generalized Moments

    Hideki SATOH  

     
    PAPER-Nonlinear Problems

      Page(s):
    3381-3388

    A moment matrix analysis (MMA) method can derive macroscopic statistical properties such as moments, response time, and power spectra of non-linear equations without solving the equations. MMA expands a non-linear equation into simultaneous linear equations of moments, and reduces it to a linear equation of their coefficient matrix and a moment vector. We can analyze the statistical properties from the eigenvalues and eigenvectors of the coefficient matrix. This paper presents (1) a systematic procedure to linearize non-linear equations and (2) an expansion of the previous work of MMA to derive the statistical properties of various non-linear equations. The statistical properties of the logistic map were evaluated by using MMA and computer simulation, and it is shown that the proposed systematic procedure was effective and that MMA could accurately approximate the statistical properties of the logistic map even though such a map had strong non-linearity.
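
    The abstract compares MMA against computer simulation of the logistic map. The sketch below shows only what such a simulation baseline might look like, estimating the first two moments by time-averaging an orbit; it is not the MMA method itself. The map parameter a = 4, the initial value, the burn-in and the orbit length are assumptions chosen because the invariant density at a = 4 is known in closed form (mean 1/2, second moment 3/8).

```python
def logistic_moments(a=4.0, x0=0.3, n_steps=100000, burn_in=1000):
    """Time-average estimates of E[x] and E[x^2] for x_{n+1} = a*x_n*(1 - x_n).

    All parameter values here are illustrative assumptions, not taken
    from the paper.
    """
    x = x0
    # Discard an initial transient so the orbit settles onto the attractor.
    for _ in range(burn_in):
        x = a * x * (1.0 - x)
    s1 = 0.0
    s2 = 0.0
    for _ in range(n_steps):
        x = a * x * (1.0 - x)
        s1 += x
        s2 += x * x
    return s1 / n_steps, s2 / n_steps

mean, second_moment = logistic_moments()
print(mean, second_moment)  # theory for a = 4: 0.5 and 0.375
```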

  • Maximum Likelihood Analysis of Masked Data in Competing Risks Models with an Environmental Stress

    Yoshimitsu NAGAI  

     
    PAPER-Reliability, Maintainability and Safety Analysis

      Page(s):
    3389-3396

    Estimating component reliabilities is an important problem. For a series system, cost and time constraints associated with failure analysis mean that not all components can be investigated, and in some cases the cause of failure is only narrowed to a subset of components. When such a case occurs, we say that the cause of failure is masked. It is also necessary in some cases to take account of the influence of an environmental stress on all components. In this paper, we consider 2- and 3-component series systems when the component lifelengths are exponentially distributed and an environmental stress follows either a gamma or an inverse Gaussian distribution. We show that the lifelength of the system and the cause of failure are independent of each other. By comparison between the hazard functions in both models, we see that quite short and long lifelengths are more likely to occur in a gamma model than in an inverse Gaussian one. Assuming that the masking probabilities do not depend on which component actually fails, we show that the likelihood function can be factorized into three parts by a reparametrization. For some special cases, some estimators are given in closed form. We use computer failure data to show that our model is useful for analyzing real masked data. Compared against the Kaplan-Meier estimator, our models fit this computer data better than a model with no environmental stress. Further, we determine a suitable model using AIC, and see that the gamma model fits the data better than the inverse Gaussian one. From a limited simulation study for a 3-component series system, we see that the relative errors of some estimators are inversely proportional to the square root of the expected number of systems whose cause of failure is identified.

  • Derivation on Bit Error Probability of Coded QAM Using Integer Codes

    Hristo KOSTADINOV  Hiroyoshi MORITA  Nikolai MANEV  

     
    PAPER-Communication Theory and Signals

      Page(s):
    3397-3403

    In this paper we present the exact expressions for the bit error probability over a Gaussian noise channel of coded QAM using single error correcting integer codes. It is shown that the proposed integer codes have a better performance with respect to the lower bound on the bit error probability for trellis coded modulation.

  • A New Feature Extraction for Iris Identification Using Scale-Space Filtering Technique

    Jinil HONG  Woo Suk YANG  Dongmin KIM  Young-Ju KIM  

     
    PAPER-Image

      Page(s):
    3404-3408

    In this paper, we introduce a new technique that uses scale-space filtering to extract the unique features from an iris image. The resulting iris code can be used to develop a system for rapid and automatic human identification with high reliability and confidence levels. First, the iris part is separated from the whole image and the radius and center of the iris are evaluated. Next, the regions that have a high possibility of being noise are discriminated and the features presented in the highly detailed pattern are then extracted. In order to conserve the original signal while minimizing the effect of noise, scale-space filtering is applied. Experiments are performed using a set of 272 iris images taken from 18 persons. Test results show that the iris feature patterns of different persons are clearly discriminated from those of the same person.

  • Analysis and Evaluation of Required Precision for Color Images in Digital Cinema Application

    Junji SUZUKI  Isao FURUKAWA  Sadayasu ONO  

     
    PAPER-Image

      Page(s):
    3409-3419

    Digital cinema will continue, for some time, to use image signals converted from the density values of film stock through some form of digitization. This paper investigates the required numbers of quantization bits for both intensity and density. Equations for the color differences created by quantization distortion are derived on the premise that the uniform color space L* a* b* can be used to evaluate color differences in digitized pictorial color images. The location of the quantized sample that yields the maximum color difference in the color gamut is theoretically analyzed with the proviso that the color difference must be below the perceivable limit of human visual systems. The result shows that the maximum color difference is located on a ridge line or a surface of the color gamut. This can reduce the computational burden for determining the required precision for color quantization. Design examples of quantization resolution are also shown by applying the proposed evaluation method to three actual color spaces: NTSC, HDTV, and ROMM.

  • A View on the Fourier Integrals and Related Delta Function

    Yoshihiko AKAIWA  

     
    PAPER-General Fundamentals and Boundaries

      Page(s):
    3420-3423

    The Fourier integrals are treated as a rigorous extension of the Fourier series expansion. The reward is that the so-called singular functions of the Fourier integrals, i.e., functions that are not absolutely integrable such as trigonometric functions, can be discussed within the framework of ordinary functions, which gives a foundation for the delta function as a distribution.

  • Effect of Time Division on Estimation Accuracy in Frequency Domain ICA

    Yasunari YOKOTA  Hideaki IWATA  Motoki SHIGA  

     
    LETTER-Digital Signal Processing

      Page(s):
    3424-3428

    This study investigates the effect of the method of time division in frequency domain ICA on estimation accuracy of ICA. We show that source signals expressed in the frequency domain lose non-Gaussianity and independence because of the long and overlapping window function, respectively, in time division. Consequently, the estimation accuracy of ICA decreases.

  • A Construction of Low-Peak-Factor Pseudo White Noise

    Takafumi HAYASHI  

     
    LETTER-Digital Signal Processing

      Page(s):
    3429-3432

    A new construction of sequences having both a low peak factor (crest factor) and flat power spectrum is proposed. The flat power spectrum provides zero auto-correlation except for the case of zero shift. The proposed construction is based on a systematic scheme that does not require a search, and affords sequences of length 4n(2n+1) for an arbitrary integer n.

  • Digital Calibration Techniques for Pipelined ADCs

    Jeongpyo KIM  Yongchul SONG  Beomsup KIM  

     
    LETTER-Analog Signal Processing

      Page(s):
    3433-3435

    This paper describes a technique for background digital multistage calibration in the removal of nonlinearities caused by design limitations in pipelined analog-to-digital converters (ADCs). Foreground initialization reduces the calibration time. Furthermore, an improved background skip-and-fill method enables the ADC to trace environmental changes. This method uses a least mean square adaptive algorithm that is digitally implemented with a significantly reduced number of tap coefficients.
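
    The least mean square (LMS) adaptive algorithm named in this abstract can be sketched in its generic textbook form as follows. This is not the authors' calibration scheme: it is a plain LMS system-identification loop, and the tap values, step size and signal length are hypothetical.

```python
import random

def lms_identify(x, d, n_taps, mu):
    """Adapt FIR tap weights w so that the filtered input x tracks d.

    Standard LMS update: w <- w + mu * e * x_window, where e is the
    instantaneous error between the desired sample and the filter output.
    """
    w = [0.0] * n_taps
    for n in range(n_taps - 1, len(x)):
        window = [x[n - k] for k in range(n_taps)]        # x[n], x[n-1], ...
        y = sum(wk * xk for wk, xk in zip(w, window))     # filter output
        e = d[n] - y                                      # error signal
        w = [wk + mu * e * xk for wk, xk in zip(w, window)]
    return w

# Hypothetical unknown 3-tap system driven by a random input (noise-free).
rng = random.Random(1)
true_w = [0.5, -0.3, 0.1]
x = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
d = [sum(true_w[k] * x[n - k] for k in range(3) if n - k >= 0)
     for n in range(len(x))]

w_hat = lms_identify(x, d, n_taps=3, mu=0.05)
print(w_hat)  # should be close to true_w
```

    The per-sample cost scales with the number of taps, which is why a reduced number of tap coefficients, as the abstract reports, lowers the digital implementation cost.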

  • State Dependent Dwell Time Switching for Discrete-Time Stable Systems

    Jung-Su KIM  Tae-Woong YOON  Claudio DE PERSIS  

     
    LETTER-Systems and Control

      Page(s):
    3436-3438

    A switched nonlinear system is considered, and the interval between two consecutive switchings is assumed to be greater than a value called "the dwell time." When switching among nonlinear systems, using a constant dwell time generally fails to lead to stability. In this letter, a state dependent dwell time function with convergence guarantees is presented for discrete-time stable nonlinear systems.

  • Solutions of Takagi-Sugeno Fuzzy-Model-Based Dynamic Equations via Orthogonal Functions

    Wen-Hsien HO  Jyh-Horng CHOU  

     
    LETTER-Systems and Control

      Page(s):
    3439-3442

    The orthogonal function approach is developed in this paper to solve the Takagi-Sugeno (TS) fuzzy-model-based dynamic equations. The new method simplifies the procedure of solving the TS-fuzzy-model-based dynamic equations into the successive solution of a system of recursive formulae only involving matrix algebra. Based on the presented recursive formulae, an algorithm only involving straightforward algebraic computation is also proposed in this paper. The computational complexity can therefore be reduced remarkably. An illustrative example shows that the proposed method based on the orthogonal functions can obtain satisfactory results.

  • Security Notes on Generalization of Threshold Signature and Authenticated Encryption

    Shuhong WANG  Guilin WANG  Feng BAO  Jie WANG  

     
    LETTER-Information Security

      Page(s):
    3443-3446

    In 2000, Wang et al. proposed a (t,n) threshold signature scheme with (k,l) threshold shared verification, and a (t,n) threshold authenticated encryption scheme with (k,l) threshold shared verification. Later, Tseng et al. mounted some attacks against Wang et al.'s schemes. At the same time, they also presented improvements. In this paper, we first point out that Tseng et al.'s attacks are actually invalid due to their misunderstanding of Wang et al.'s schemes. Then, we show that both Wang et al.'s schemes and Tseng et al.'s improvements are indeed insecure by demonstrating several effective attacks.

  • Performance of Cellular CDMA Systems Using SBF and TBF Array Antennas under Multi-Cell Environment

    Hyunduk KANG  Insoo KOO  Vladimir KATKOVNIK  Kiseon KIM  

     
    LETTER-Spread Spectrum Technologies and Applications

      Page(s):
    3447-3451

    In cellular systems, a code division multiple access (CDMA) technology with array antennas can significantly reduce interference by taking advantage of the combination of spread spectrum and spatial filtering. We investigate the performance of cellular CDMA systems that adopt two types of array antennas, switched beam forming (SBF) and tracking beam forming (TBF), in the base station. Through Monte-Carlo simulations, we evaluate the average bit-error-rate (BER) and outage probability of the systems under log-normal shadowing channels in a multi-cell environment. When we consider 2 beams and 4 beams per sector for the SBF method, it is observed that the TBF method gives at least 10% and 30% capacity improvement over the SBF method in terms of 10^-3 BER and 1% outage probability, respectively.