Kazuya YAMAMOTO Tetsuya HEIMA Akihiko FURUKAWA Masayoshi ONO Yasushi HASHIZUME Hiroshi KOMURASAKI Hisayasu SATO Naoyuki KATO
This paper describes two kinds of on-chip matched low-noise/driver MMIC amplifiers (LN/D-As) suitable for 2.4-GHz and 5.2-GHz short-range wireless applications. The ICs are fabricated in a 0.18-µm bulk CMOS process that has no extra processing steps for enhancing RF performance. Although bulk CMOS incurs large RF substrate and conductor losses, the use of a current-reuse topology and interdigitated capacitors (IDCs) enables sufficiently low-noise and high-output-power operation with low current dissipation. The main measurement results are as follows. The 2.4-GHz LN/D-A achieves a 3.8-dB noise figure (NF) and a 10.1-dB gain at 1.8 V and 6 mA, and a 3.4-dBm 1-dB gain-compressed output power (P1dB) at 2.4 V and 13 mA. The 5.2-GHz LN/D-A achieves a 4.9-dB NF and an 11.1-dB gain at 1.8 V and 10 mA, and a 2.3-dBm P1dB at 2.4 V and 16 mA. Both MMICs are suited for use as low-noise amplifiers and driver amplifiers in 2.4-GHz and 5.2-GHz low-cost, low-power wireless systems such as Bluetooth and HiperLAN.
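To put the quoted operating points on a common scale, the short sketch below converts each supply condition to DC power in mW and each P1dB figure from dBm to mW. The function names and the dictionary layout are illustrative assumptions; only the voltages, currents, and dBm values come from the abstract.

```python
def dc_power_mw(supply_v, current_ma):
    """DC power in mW for a supply voltage in V and a current in mA."""
    return supply_v * current_ma


def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to mW (0 dBm corresponds to 1 mW)."""
    return 10 ** (p_dbm / 10)


# Operating points quoted in the abstract: (supply V, current mA, P1dB dBm).
# The dictionary itself is only an illustrative container.
driver_mode = {
    "2.4-GHz LN/D-A": (2.4, 13, 3.4),
    "5.2-GHz LN/D-A": (2.4, 16, 2.3),
}

for name, (volts, milliamps, p1db_dbm) in driver_mode.items():
    p_dc = dc_power_mw(volts, milliamps)
    p_out = dbm_to_mw(p1db_dbm)
    print(f"{name}: P_dc = {p_dc:.1f} mW, P1dB = {p_out:.2f} mW")
```

The same helper gives 10.8 mW and 18 mW of DC power for the low-noise conditions (1.8 V at 6 mA and 1.8 V at 10 mA, respectively).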
Sheng-He SUN Xiao-Dan MEI Zhao-Li ZHANG
A novel rough neural network (RNN) structure and its application are proposed in this paper. We principally introduce its architecture and training algorithms: the genetic training algorithm (GA) and the tabu search training algorithm (TSA). We first compare the RNN with a conventional NN trained by the BP algorithm in two-dimensional data classification. We then compare the RNN with an NN trained by the same algorithm (TSA) in function approximation. Experimental results show that the proposed RNN is more effective than the NN, not only in computation time but also in performance.
Kwang-Deok SEO Kook-Yeol YOO Jae-Kyoon KIM
Quantization is an essential step that leads to compression in the discrete cosine transform (DCT) domain. In this paper, we show how a statistically non-optimal uniform quantizer can be improved by employing an efficient reconstruction method. For this purpose, we estimate the probability distribution function (PDF) of the original DCT coefficients in the decoder. By applying the estimated PDF to the reconstruction process, the dequantization distortion can be reduced. The proposed method can be used in practically any application where uniform quantizers are used. In particular, it can be used for the quantization scheme of the JPEG and MPEG coding standards.
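The idea of PDF-aware reconstruction can be illustrated with a minimal sketch. It assumes a zero-mean Laplacian model with rate parameter `lam` for the DCT coefficients and a rounding uniform quantizer with step `step`; both the model and the names are assumptions made here for illustration, not necessarily the estimator used in the paper. Instead of reconstructing at the bin midpoint, the decoder reconstructs at the centroid (conditional mean) of the bin under the assumed PDF.

```python
import math


def centroid_reconstruct(index, step, lam):
    """Reconstruct a quantized coefficient at the centroid of its bin.

    Assumes a rounding uniform quantizer (bin k covers
    [(k - 0.5) * step, (k + 0.5) * step]) and a zero-mean Laplacian PDF
    with rate lam > 0. Both are illustrative assumptions.
    """
    if index == 0:
        return 0.0  # the zero bin is symmetric around the origin
    k = abs(index)
    lower = (k - 0.5) * step
    # Mean of an exponential density truncated to [lower, lower + step].
    centroid = lower + 1.0 / lam - step / math.expm1(lam * step)
    return math.copysign(centroid, index)


def midpoint_reconstruct(index, step):
    """Conventional reconstruction at the bin midpoint."""
    return index * step


if __name__ == "__main__":
    step, lam = 10.0, 0.1
    for k in (-2, -1, 0, 1, 2):
        print(k, midpoint_reconstruct(k, step), round(centroid_reconstruct(k, step, lam), 3))
```

Because a Laplacian density decays away from zero, the centroid of every nonzero bin lies closer to zero than the midpoint. As `lam` approaches zero (a nearly flat PDF), the centroid approaches the midpoint and the two reconstructions coincide.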
Hiroyuki TAKANO Takashi MIYAMORI Yasuhiro TANIGUCHI Yoshihisa KONDO
A 4-GOPS, 3-way VLIW image recognition processor for an automobile system has been developed. The processor is based on a configurable and extensible media processor that can be optimized for a specific application by means of design-time configuration. Using the VLIW coprocessor extension, the processor can satisfy the performance requirements of the system. The overhead due to VLIW-mode instructions is only 7%, and the VLIW coprocessor occupies only 12% of the die area. Thus, this configurable media processor can achieve good cost-performance for media processing in each embedded system.
This paper proposes constructive timing violation (CTV) and evaluates its potential. CTV can be used both for increasing clock frequency and for reducing energy consumption. Increasing the clock frequency beyond that determined by the critical paths causes timing violations. On the other hand, while supply voltage reduction can result in substantial power savings, it also increases gate delay, and thus the clock must be slowed down in order not to violate the timing constraints of the critical paths. However, if a mechanism that tolerates timing violations is provided, it is not necessary to keep the constraints. Rather, the violations become constructive, enabling a higher clock frequency or energy savings. From these observations, we propose CTV, which is supported by a tolerance mechanism based on contemporary speculative execution mechanisms. We evaluate CTV using a cycle-by-cycle simulator and show that its potential is promising.
Media processing has become one of the dominant computing workloads. In this context, SIMD instructions have been introduced in current processors to raise performance, often the main goal of microprocessor designers. Today, however, designers have become concerned with power consumption, and in some cases (for example, laptops) low power is the main design goal. In this paper, we show that SIMD ISA extensions on a superscalar processor can be one solution for reducing power consumption while keeping a high performance level. We reduce the average power consumption by decreasing the number of instructions and the number of cache references, and by using dynamic power management to convert the performance speedup into a reduction in power consumption.
Tatsuo TERUYAMA Tetsuo KAMADA Masashi SASAHARA Shardul KAZI
The strong demand for complex, high-performance systems-on-a-chip requires a high-performance microprocessor core and a quick-turnaround design methodology. We have developed a 128-bit synthesizable processor core and a tile-based quick-turnaround design methodology. The core is a 200-MHz MIPS-compatible processor with a 128-bit SIMD extension and is targeted at consumer electronics. We also developed an ASSP that includes the processor core, an SDRAM controller, 2 PCI and 2 MAC, mainly for network applications. For SoC development, we developed a tile-based design methodology aimed at quick design convergence. The initial RTL design is synthesized and partitioned into several tiles by an in-house tiling tool. This approach promises quick turnaround from RTL design to tape-out by exploiting the concurrency of the back-end design.
Phase noise characteristics are the key to obtaining high-quality frequency spectra when designing oscillators and switches. Since the phase noise is directly affected by the 1/f noise of the transistors in the circuit, 1/f noise measurement and modeling are important. This paper describes 1/f noise measurement, a frequency- and bias-dependent flicker noise model, and a noise parameter extraction method for MOSFETs. In addition, the geometry dependencies of the drain-current 1/f noise of MOSFETs are analyzed and modeled. The model has been verified by measuring the noise current spectral density of MOSFETs fabricated in two different processes.
Takao OURA Teru YONEYAMA Shashidhar TANTRY Hideki ASAI
In this report, we propose a new bilateral floating resistor circuit that provides both positive and negative resistance values. The equivalent resistance of this floating resistor in CMOS technology can be changed by control voltages, which is an advantage over polysilicon or diffused resistors in integrated circuits. Moreover, the characteristics of the proposed circuit are independent of the threshold voltage. We have simulated the proposed circuit using HSPICE and confirmed that it is useful as an analog component.
Hitoshi MURAI Hiromi T. YAMADA Kozo FUJII
Initial phase alternation of RZ pulses with duty cycles beyond 50% in a dispersion-managed (DM) link is found to help stabilize DM soliton transmission. Stable soliton propagation of such wide RZ pulses should ease the difficulty of designing soliton-based DWDM systems because each channel occupies less spectrum. As a proof of concept, 40 Gbit/s WDM transmission is investigated numerically; initial phase alternation improves the transmission distance by a factor of 2 in the regime limited by soliton-soliton interaction. The advantage of this concept has also been verified by 40 Gbit/s single-channel and 8-channel WDM transmission experiments using OTDM techniques with initial phase alternation.
Kenichi SUZUKI Mitsuhiro TAKEDA Atsushi KAMO Hideki ASAI
This letter presents a novel application of Verilog-A, a hardware description language for analog circuits, to the modeling and simulation of high-speed interconnects in the time/frequency transform domain for signal integrity problems. The modeling method handles transfer function approximations and admittance matrices expressed by dominant poles and residues, as used in the AWE technique. Finally, it is shown that high-speed interconnects with nonlinear terminations can be modeled and simulated easily.
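A pole-residue representation of the kind mentioned above can be sketched in a few lines. The example below is written in Python rather than Verilog-A and is only an illustration of the general form H(s) = d + Σ r_i / (s − p_i) and its impulse response h(t) = Σ r_i·exp(p_i·t); the function names, the optional direct term `direct`, and the single-pole test values are assumptions, not taken from the letter.

```python
import cmath
import math


def transfer_function(s, poles, residues, direct=0.0):
    """Evaluate H(s) = direct + sum(r_i / (s - p_i)) at a complex frequency s."""
    return direct + sum(r / (s - p) for p, r in zip(poles, residues))


def impulse_response(t, poles, residues):
    """Evaluate h(t) = sum(r_i * exp(p_i * t)) for t >= 0.

    The real part is returned; for complex poles this assumes they are
    supplied in conjugate pairs with conjugate residues.
    """
    return sum(r * cmath.exp(p * t) for p, r in zip(poles, residues)).real


if __name__ == "__main__":
    # Hypothetical single-pole low-pass example: H(s) = 1 / (s + 1).
    poles, residues = [-1.0], [1.0]
    for omega in (0.0, 1.0, 10.0):
        h = transfer_function(1j * omega, poles, residues)
        print(f"omega = {omega:5.1f} rad/s  |H| = {abs(h):.4f}")
    print("h(1) =", impulse_response(1.0, poles, residues))
```

Keeping only a few dominant poles makes both the frequency-domain and time-domain evaluations inexpensive, which is the usual motivation for reduced-order interconnect models of this form.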
May SUZUKI Manabu KAWABE Takashi YANO Junko KIYOTA Hirotake ISHII Tsuyoshi TAMAKI Nobukazu DOI
In this paper, a new multi-engine architecture for the baseband modem LSI of W-CDMA systems is proposed. The developed test chip with this architecture is also presented. In the multi-engine architecture, processors and wired logic are combined to obtain both flexibility and low power dissipation. Multiple processors are used in the LSI to lower its operating frequency by distributed processing. A customized processor is used to lower the overhead of multiple processors in terms of LSI scale. The test chip was fabricated with a 0.25-µm process. Its measured power dissipation for simultaneous 384 kbit/s downlink reception and 64 kbit/s uplink transmission was 160 mW.
Johannes KNEIP Matthias WEISS Wolfram DRESCHER Volker AUE Jurgen STROBEL Thomas OBERTHUR Michael BOLLE Gerhard FETTWEIS
This paper presents the HiperSonic 1, a multi-standard, application-specific signal processor designed to execute the baseband conversion algorithms in IEEE802.11a- and HiperLAN/2-based 5 GHz wireless LAN applications. In contrast to the widespread dedicated implementations, most of the computational effort here was mapped onto a configurable, data- and instruction-parallel DSP core. The core is supplemented by mixed-signal A/D and D/A converters and hardware accelerators. The memory and register architecture, instruction set, and peripheral interfaces of the chip were carefully optimized for the targeted applications, leading to a sound combination of flexibility, die area, and power consumption. The 120 MHz, 7.6 million-transistor solution was implemented in 0.18 µm CMOS and performs IEEE802.11a- or HiperLAN/2-compliant baseband processing at data rates up to 60 Mbit/s.
Shih-Chang HSIA I-Chang JOU Shing-Ming HWANG
Watermarking techniques are widely used to protect secret documents. Most existing work concentrates on binary-data watermarking and extracts the watermark by comparing the original image with the watermarked image. In this paper, an efficient watermarking algorithm with two-layer hiding is presented for gray-level image watermarking. In the first layer, the key information is found based on the codebook concept. In the second layer, the secret key is further hidden in the watermarked image by an encryption based on spatial distribution. Simulations demonstrate that the watermarking information is perceptually invisible in the watermarked image. Moreover, the gray-level watermark can be extracted by referring to key parameters rather than the original image, and the extraction quality is very good.
Today, an ultra-high-capacity transmission system based on an N × 40 Gbit/s channel rate is the most promising approach to achieving multi-terabit/s capacity over a single fiber. We have demonstrated 5.12 Tbit/s transmission of 128 channels at 40 Gbit/s over 3100 km and 10.24 Tbit/s transmission of 256 channels at 42.6 Gbit/s (using FEC) over 100 km, based on four main technologies: 40 Gbit/s electrical time-division multiplexing (ETDM), vestigial sideband (VSB) demultiplexing, advanced amplifier technology including Raman amplification, and TeraLight™ fiber. A record spectral efficiency of 1.28 bit/s/Hz is used to achieve 10.24 Tbit/s transmission within the C- and L-bands.
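The capacity figures above follow from simple arithmetic, sketched below. The helper names are illustrative. Two points are inferences rather than statements from the abstract: the 10.24 Tbit/s total matches 256 channels at a 40 Gbit/s payload rate, which suggests that the 42.6 Gbit/s line rate includes FEC overhead, and the total optical bandwidth is simply the capacity divided by the quoted spectral efficiency.

```python
def total_capacity_tbit(channels, rate_gbit):
    """Aggregate capacity in Tbit/s for a number of channels at a per-channel rate in Gbit/s."""
    return channels * rate_gbit / 1000.0


def occupied_bandwidth_thz(capacity_tbit, spectral_efficiency):
    """Optical bandwidth in THz implied by a capacity (Tbit/s) and a spectral efficiency (bit/s/Hz)."""
    return capacity_tbit / spectral_efficiency


if __name__ == "__main__":
    print(total_capacity_tbit(128, 40))    # 128 channels at 40 Gbit/s
    print(total_capacity_tbit(256, 40))    # 256 channels, payload rate assumed to be 40 Gbit/s
    print(total_capacity_tbit(256, 42.6))  # same channels counted at the FEC line rate
    print(occupied_bandwidth_thz(10.24, 1.28))
```

Under these assumptions, the 10.24 Tbit/s experiment implies roughly 8 THz of occupied optical bandwidth across the two bands.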
Koji HOSAKA Shinichi HARASE Shoji IZUMIYA Takehiko ADACHI
The cascode crystal oscillator is widely used as a stable frequency source in mobile communication equipment. Recently, IC implementation of the cascode crystal oscillator has become necessary. The cascode crystal oscillator is composed of a Colpitts crystal oscillator and a cascode-connected common-base buffer amplifier, and its base bypass capacitor hinders area reduction. In this paper, we propose new structures of the cascode crystal oscillator suitable for integrated circuits. The proposed circuits reduce the area and the start-up time without degrading the frequency stability against load impedance variation or other performance characteristics. Simulations and experiments have shown the effectiveness of the proposed circuits.
Kiyoko KATAYANAGI Yasuyuki MURAKAMI Masao KASAHARA
Recently, Kasahara and Murakami proposed new product-sum type public-key cryptosystems with the Chinese remainder theorem, Methods B-II and B-IV. They also proposed a new technique of selectable encryption key, which is referred to as 'Home Page Method (HP Method).' In this paper, first, we describe Methods B-II and B-IV. Second, we propose an effective attack for Method B-II and discuss the security of Methods B-II and B-IV. Third, applying the HP Method to Methods B-II and B-IV, we propose new product-sum type PKC with selectable encryption key. Moreover, we discuss the security of the proposed cryptosystems.
Naohiko IRIE Fumio ARAKAWA Kunio UCHIYAMA Shinichi YOSHIOKA Atsushi HASEGAWA Kevin IADONATE Mark DEBBAGE David SHEPHERD Margaret GEARTY
An embedded processor core using a split-branch architecture has been developed. This processor core targets 400 MHz in 0.18 µm technology, and this higher frequency requires a deeper pipeline than that of a conventional processor. To solve the increasing branch penalty caused by the deeper pipeline, the processor adopts an active preload mechanism that preloads target instructions into internal buffers in order to hide the instruction cache latency. The processor also uses multiple instruction buffers to reduce the penalty cycles of branch misprediction. Performance estimation shows that about 70% of the branch overhead cycles can be eliminated relative to the conventional implementation. This branch mechanism occupies only 1% of the total core area, which is smaller than the conventional branch target buffer (BTB) scheme, and helps to achieve low power and low cost.
Ganesan UMANESAN Eiji FUJIWARA
Existing byte error control codes require too many check bits when applied to memory systems that use recent semiconductor memory chips with wide I/O data, such as 16 or 32 bits, i.e., b=16 or 32. On the other hand, semiconductor memory chips are highly vulnerable to random double-bit errors within a chip when they are used in applications such as satellite memory systems. It is therefore necessary to design new codes capable of correcting double-bit errors within a chip for computer memory systems. This correspondence proposes a class of codes called Double bit within a block Error Correcting - Single b-bit byte Error Correcting ((DEC)B-SbEC) codes, where block and byte correspond to memory chip and memory sub-array data outputs, respectively. The proposed codes provide protection from both random double-bit errors and single sub-array data faults. In most practical cases, the (DEC)B-SbEC codes presented in this correspondence can accommodate the check bits in a single dedicated memory chip.
A fast, low-power 16-bit adder, a 32-word register file, and a 512-bit cache SRAM have been developed using 0.25-µm GaAs HEMT technology for future multi-GHz processors. The 16-bit adder, which uses a negative-logic binary look-ahead carry structure based on NOR gates, operates at a maximum clock frequency of 1.67 GHz and consumes 134.4 mW at a supply voltage of 0.6 V. Its active area is 1.6 mm² and it contains about 1,230 FETs. A new DC/DC level converter has been developed for use in high-speed, low-power storage circuits such as SRAMs and register files. The level converter can increase the DC voltage supplied to an active load circuit on request, or supply a minimal DC voltage to a load circuit in stand-by mode. The power dissipation (P) of the 32-word register file with on-chip DC/DC level converters is 459 mW, a reduction to 25.2% of that of an equivalent conventional register file, while the operating frequency (fc) is 5.17 GHz, which is 74.8% of fc for the conventional register file. P for the 512-bit cache SRAM with the new DC/DC level converters is 34.3 mW, 89.7% of the value for an equivalent conventional cache SRAM, with a read-access time of 455 ps, only 1.1% longer than that of the conventional cache SRAM.