1-19hit |
Koichi FUJIWARA Kazushi KAWAMURA Shin-ya ABE Masao YANAGISAWA Nozomu TOGAWA
Recently, high-level synthesis (HLS) techniques for FPGA designs are required in various applications such as computerized stock tradings and reconfigurable network processings. In HLS for FPGA designs, we need to consider module floorplan and reduce multiplexer's cost concurrently. In this paper, we propose a floorplan-driven HLS algorithm for multiplexer reduction targeting FPGA designs. By utilizing distributed-register architectures called HDR, we can easily consider module floorplan in HLS. In order to reduce multiplexer's cost, we propose two novel binding methods called datapath-oriented scheduling/FU binding and datapath-oriented register binding. Experimental results demonstrate that our algorithm can realize FPGA designs which reduce the number of slices by up to 47% and latency by up to 22% compared with conventional approaches while the number of required control steps is almost the same.
XiaoBo JIANG DeSheng YE HongYuan LI WenTao WU XiangMin XU
We propose an asynchronous datapath for the low-density parity-check decoder to decrease power consumption. Glitches and redundant computations are decreased by the asynchronous design. Taking advantage of the statistical characteristics of the input data, we develop novel key arithmetic elements in the datapath to reduce redundant computations. Two other types of datapaths, including normal synchronous design and clock-gating design, are implemented for comparisons with the proposed design. The three designs use similar architectures and realize the same function by using the 0.18µm process of the Semiconductor Manufacturing International Corporation. Post-layout result shows that the proposed asynchronous design exhibits the lowest power consumption. The proposed asynchronous design saves 48.7% and 21.9% more power than the normal synchronous and clock-gating designs, respectively. The performance of the proposed datapath is slightly worse than the clock-gating design but is better than the synchronous design. The proposed design is approximately 7% larger than the other two designs.
Sha SHEN Weiwei SHEN Yibo FAN Xiaoyang ZENG
This paper describes a unified VLSI architecture which can be applied to various types of transforms used in MPEG-2/4, H.264, VC-1, AVS and the emerging new video coding standard named HEVC (High Efficiency Video Coding). A novel design named configurable butterfly array (CBA) is also proposed to support both the forward transform and the inverse transform in this unified architecture. Hadamard transform or 4/8-point DCT/IDCT are used in traditional video coding standards while 16/32-point DCT/IDCT are newly introduced in HEVC. The proposed architecture can support all these transform types in a unified architecture. Two levels (architecture level and block level) of hardware sharing are adopted in this design. In the architecture level, the forward transform can share the hardware resource with the inverse transform. In the block level, the hardware for smaller size transform can be recursively reused by larger size transform. The multiplications of 4 or 8-point transform are implemented with Multiplierless MCM (Multiple Constant Multiplication). In order to reduce the hardware overhead, the multiplications of 16/32 point DCT are implemented with ICM (input-muxed constant multipliers) instead of MCM or regular multipliers. The proposed design is 51% more area efficient than previous work. To the author's knowledge, this is the first published work to support both forward and inverse 4/8/16/32-point integer transform for HEVC standard in a unified architecture.
Yutaka ARAYASHIKI Takashi KAMIZONO Yukio OHKUBO Taisuke MATSUMOTO Yoshiaki AMANO Yutaka MATSUOKA
We fabricated low-jitter 2:1 multiplexer (MUX) and 1:2 demultiplexer (DEMUX) modules for bit error rate testers that can be used for research into ultra-high-bitrate communication subsystems and devices with bitrates of over 100 Gbit/s. The 1:2 DEMUX IC design took into consideration an IC layout allowing module pin placement for optimal utility. With regard to mounting, the 2:1 MUX and 1:2 DEMUX modules were constructed using transmission lines of grounded coplanar waveguide (G-CPW) configuration, which offers excellent high-frequency characteristics. These modules operated at 113 Gbit/s with a low root mean square jitter of 548 fs and 587 fs, respectively.
Ryosuke HIRAKO Kiyo ISHII Hiroshi HASEGAWA Ken-ichi SATO Osamu MORIWAKI
We propose a compact matrix-switch-based hierarchical optical cross-connect (HOXC) architecture that effectively handles the colorless waveband add/drop ratio restriction so as to realize switch scale reduction. In order to implement the colorless waveband add/drop function, we develop a wavelength MUX/DMUX that can be commonly used by different wavebands. We prove that the switch scale of the proposed HOXC is much smaller than that of conventional single-layer optical cross-connects (OXCs) and a typical HOXC. Furthermore, we introduce a prototype system based on the proposed architecture that utilizes integrated novel wavelength MUXs/DMUXs. Transmission experiments prove its technical feasibility.
Yutaka ARAYASHIKI Yukio OHKUBO Taisuke MATSUMOTO Yoshiaki AMANO Akio TAKAGI Yutaka MATSUOKA
We fabricated a 2:1 multiplexer IC (MUX) with a retiming function by using 1-µm self-aligned InP/InGaAs/InP double-heterojunction bipolar transistors (DHBTs) with emitter mesa passivation ledges. The MUX operated at 120 Gbit/s with a power dissipation of 1.27 W and output amplitude of 520 mV when measured on the wafer. When assembled in a module using V-connectors, the MUX operated at 113 Gbit/s with a 514-mV output amplitude and a power dissipation of 1.4 W.
Yoshihito HASHIMOTO Shinichi YOROZU Yoshio KAMEDA
A cryocooled system with I/O interface circuits, which enables high-speed system operation of superconductive single-flux-quantum (SFQ) circuits at over 40 GHz, and the demonstration of a 47-Gbps SFQ 22 switch system are presented. The cryocooled system has 32 I/Os and cools an SFQ multi-chip module (MCM) to 4 K with a two-stage 1-W Gifford-McMahon cryocooler. An SFQ 4:1 multiplexer (MUX) and an SFQ 1:4 demultiplexer (DEMUX) have been designed to interface the speed gap between the I/O (~10 Gbps/ch) and SFQ circuits (>40 GHz). An SFQ 22 switch chip, in which the MUX/DEMUX and an SFQ 22 switch are integrated, and an 8-channel superconductive voltage driver (SVD) chip have been designed with an advanced cell library for a junction critical current density of 10 kA/cm2. An SFQ 22 switch MCM has been made by flip-chip bonding the switch chip and SVD chip on a superconductive MCM carrier with φ 50-µm InSn solder bumps. An SFQ 22 switch system, which is the switch MCM packaged in the cryocooled system, has been demonstrated up to a port speed of 47 Gbps for the first time.
Shoji KAKEHASHI Hiroshi HASEGAWA Ken-ichi SATO Osamu MORIWAKI
Recently we proposed a new waveband MUX/DEMUX that uses two concatenated cyclic AWGs. We analyse and formulate connection arrangements of the waveguides connecting the two AWGs. The port utilization of the device is shown to be 100% with bi-directional input fibers.
Rachid DRIAD Robert E. MAKON Karl SCHNEIDER Ulrich NOWOTNY Rolf AIDAM Rudiger QUAY Michael SCHLECHTWEG Michael MIKULLA Gunter WEIMANN
In this paper, we report a manufacturable InP DHBT technology, suitable for medium scale mixed-signal and monolithic microwave integrated circuits. The InGaAs/InP DHBTs were grown by MBE and fabricated using conventional process techniques. Devices with an emitter junction area of 4.8 µm2 exhibited peak cutoff frequency (fT) and maximum oscillation frequency (fMAX) values of 265 and 305 GHz, respectively, and a breakdown voltage (BVCEo) of over 5 V. Using this technology, a set of mixed-signal IC building blocks for ≥ 80 Gbit/s fibre optical links, including distributed amplifiers (DA), voltage controlled oscillators (VCO), and multiplexers (MUX), have been successfully fabricated and operated at 80 Gbit/s and beyond.
Two operations, polynomial multiplication and modular reduction, are newly induced by the properties of the modified Booth's algorithm and irreducible all one polynomials, respectively. A new and effective methodology is hereby proposed for computing multiplication over a class of fields GF(2m) using the two operations. Then a low complexity multiplexer-based multiplier is presented based on the aforementioned methodology. Our multiplier consists of m 2-input AND gates, an (m2 + 3m - 4)/2 2-input XOR gates, and m(m - 1)/2 4 1 multiplexers. For the detailed estimation of the complexity of our multiplier, we will expand this argument into the transistor count, using a standard CMOS VLSI realization. The compared results show that our work is advantageous in terms of circuit complexity and requires less delay time compared to previously reported multipliers. Moreover, our architecture is very regular, modular and therefore, well-suited for VLSI implementation.
Toshihide SUZUKI Yasuhiro NAKASHA Hideki KANO Masaru SATO Satoshi MASUDA Ken SAWADA Kozo MAKIYAMA Tsuyoshi TAKAHASHI Tatsuya HIROSE Naoki HARA Masahiko TAKIGAWA
In this paper, we describe the operation of circuits capable of more than 40-Gbit/s that we have developed using InP HEMT technology. For example, we succeeded in obtaining 43-Gbit/s operation for a full-rate 4:1Multiplier (MUX), 50-Gbit/s operation for a Demultiplexer (DEMUX), 50-Gbit/s operation for a D-type flip-flop (D-FF), and a preamplifier with a bandwidth of 40 GHz. In addition, the achievement of 90-Gbit/s operation for a 2:1MUX and a distributed amplifier with over 110-GHz bandwidth indicates that InP HEMT technology is promising for system operations of over 100 Gbit/s. To achieve these results, we also developed several design techniques to improve frequency response above 80 GHz including a symmetric and separated layout of differential elements in the basic SCFL gate and inverted microstrip.
Futoshi FURUTA Kazuo SAITOH Kazumasa TAKAGI
We have designed a demultiplexer (DMUX) with a simple structure, high-speed operation circuits and large bias margins. By using a binary-tree architecture and clock-driven circuits, multi-channel DMUXs can be constructed easily from the same elemental circuits, i.e., 1-to-2 DMUX, consisting of a T-FF and a 1-to-2 switch. By applying cell-level optimization and Monte Carlo simulation, bias margins and operation frequency of the circuits were enlarged. Logical operations of the 1-to-2 DMUX and a multi-channel DMUX, e.g., a 1-to-4 DMUX were experimentally confirmed. It was also confirmed that the large margins, 33% of the DMUX (1-to-2 switch) was kept up regardless the degree of integration, and that the 1-to-2 DMUX can operate up to 46 GHz by using measure of average voltages across Josephson junctions.
Lizhen ZHENG Xiaofan MENG Stephen WHITELEY Theodore Van DUZER
We present the design of dual rail Data Driven Self Timed (DDST) DEMUX and MUX circuits for 50 GHz operation. The chosen current density is 6.5 kA/cm2 and simulations show good margins for speeds exceeding 50 GHz. Our previously reported dual-rail on-chip test system is also scaled up for 50 GHz operation.
Toshiyuki OCHIAI Hideaki MATSUHASHI Hiroshi HOGA Satoshi NISHIKAWA
A high-speed static logic circuit, the 1:8 demultiplexer (DEMUX), fabricated using single-gate CMOS technology (single-gate means the structure consisting of n+ poly-Si gate for both NMOS and PMOS transistors) has been demonstrated. To suppress short-channel effects in PMOS transistors, we only used the low-energy ion implantation (I/I) of BF2 at 10 keV for counterdoping of the channel and that at 5 keV for source/drain (S/D) extension. To control the threshold voltage Vth of PMOS transistors precisely, the channel dopants were implanted after the growth of the gate oxide because of the suppression of the transient-enhanced diffusion (TED) of boron, and the suppression of boron out-diffusion. A tree-type 1:8 DEMUX circuit composed of 0. 134 µm gate CMOS transistors operates at a high speed of 3.1 GHz and consumes a low power of 35.5 mW/GHz at VDD = 2.0 V. In this single-gate CMOS circuit, down to this small gate length, the maximum operating frequency of the DEMUX circuit increases proportionally with an increase of the inverse of the gate length without an increase of power consumption per GHz. At a practical 2.48832 Gb/s operation, the power consumption was 88 mW, and the phase margin between the input clock signal and the input data signal was 260 ps. It is suggested that a circuit composed of a single-gate CMOS transistor with 0.15 µm gate generation can be applicable to high speed ICs.
A multimedia coding standard, MPEG4 has frozen its Committee Draft (CD) as the MPEG4 version 1 CD, last October. It defines Audio-Visual (AV) coding Algorithms and their System Multiplex/Composition formats. Founding on Object-base concept, Video part adopts Shape Coding technology in addition to conventional Texture Coding skills. Audio part consists of voice coding tools (HVXC and CELP core) and audio coding tools (HILN and MPEG2 AAC or Twin VQ). Error resilience technologies and Synthetic and Natural Hybrid Coding (SNHC) technologies are the MPEG4 specific features. System part defines flexible Multiplexing of audio-visual bitstreams and Scene Composition for user-interactive re-construction of the scenes at decoder side. The version 1 standardization will be finalized in 1998, with some possible minute changes. The expected application areas are real-time communication, mobile multimedia, internet/intranet accessing, broadcasting, storage media, surveillance, and so on.
Sadayuki YASUDA Yusuke OHTOMO Masayuki INO Yuichi KADO Toshiaki TSUCHIYA
We have developed a design technique for static logic circuits. Using this technique, we designed 1/2 divider-type 1:4 demultiplexer (DEMUX) and 2:1 selector-type 4:1 multiplexer (MUX) circuits, each of which is a key component in high-speed data multiplexing and demultiplexing. These circuits consist of double rail flip-flops (DR F/F). These flip-flops have a smaller mean internal capacitance than single rail flip-flops, making them suitable for high-speed operation. The DR F/F has a symmetric structure, so the double rail toggle flip-flop can put out an exactly balanced CK/CKN signal, which boosts the speed of the data flip-flops. The double rail structure enables 30% faster operation but consumes only 17% more power (per GHz) than a single rail circuit. In addition, our 0.25-µm process technology provides a 70% higher frequency operation than 0.5-µm process technology. At the supply voltage of 2.2 V, the DEMUX circuit and the MUX circuit operate at 4.55 GHz and 2.98 GHz, respectively. In addition, the 0.25-µm DEMUX circuit and the MUX circuit respectively consume 6.0 mW/GHz and 13.7 mW/GHz (@1.3 V), which are only 12% of the power consumed by 3.3-V 0.5-µm circuits. Because of its high-speed and low-power characteristics, our design technique will greatly contribute to the progress of large-scale high-speed telecommunication systems.
Norio HIGASHISAKA Masaaki SHIMADA Akira OHTA Kenji HOSOGI Kazuo KUBO Noriyuki TANINO
In order to establish design and measurement technologies for an LSI that features high speed operation and low power dissipation, GaAs 2.5 Gbps 16 bit MUX/DEMUX LSIs have been successfully developed. DCFL is employed as a basic gate in order to reduce the power dissipation. For the purpose of achieving stable operation against the transistor parameter deviation, a timing design called clock tracking is employed. Moreover, to ensure accurate performance measurement, a new measurement system is introduced. The measurement system consists of an error rate detector (ERD), a pulse pattern generator (PPG) and a high speed tester (HST). The performances tested by the measurement system show the power consumptions of MUX and DEMUX LSIs are 1.35 W and 0.95 W. Input phase margin of DEMUX LSI is 290 degrees at 2.5 Gbps operation. The technologies obtained through development of these MUX/DEMUX LSIs are applicable to other high speed and low power LSIs.
This paper reviews very high-speed optical signal processing technology based on the instantaneous characteristic of optical nonlinearities. Focus is placed on 100-Gbit/s optical time-division multiplexing (TDM) transmission systems. The key technologies including ultrashort optical pulse generation, all-optical multiplexing/demultiplexing and optical timing extraction techniques are alse described together with their major issues and future prospects.
This paper examines the key technologies and applications of optical frequency division multiplexing (OFDM) systems. It is clarified that a 100-channel OFDM system is feasible as a result of multichannel frequency stabilization, common optical amplification and channel selection utilizing a tunable optical filter. Transmission limitation due to fiber four-wave mixing is also described. Major functions and applications of the OFDM are summarized and the applicability of OFDM add/drop multiplexing is examined.