Sung Woo CHUNG Gi Ho PARK Sung Bae PARK
This letter proposes a low-power tournament branch predictor in which the number of accesses to the component predictors (the local predictor or the global predictor) is reduced. Analysis with the Samsung Memory Compiler shows that the proposed branch predictor reduces power consumption by 24-45% compared to the conventional tournament branch predictor, without requiring additional storage arrays, incurring additional delay, or harming accuracy.
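For context, the following is a minimal sketch of a conventional tournament predictor, the baseline the letter compares against. It is not the proposed low-power design, whose mechanism the abstract does not describe. The table sizes, the PC-indexed "local" table (a simplification of per-branch history schemes), and all names are assumptions for illustration. The `accesses` counter shows that the baseline reads both component predictors on every prediction.

```python
def bump(counter, up):
    """Saturating 2-bit counter update (range 0..3)."""
    return min(3, counter + 1) if up else max(0, counter - 1)

class TournamentPredictor:
    """Simplified conventional tournament predictor (illustrative only)."""

    def __init__(self, bits=10):
        self.mask = (1 << bits) - 1
        size = 1 << bits
        self.local = [1] * size    # indexed by PC; >= 2 means predict taken
        self.global_ = [1] * size  # indexed by global branch history
        self.chooser = [1] * size  # >= 2 means trust the global predictor
        self.history = 0
        self.accesses = 0          # component-predictor reads so far

    def predict(self, pc):
        li = pc & self.mask
        gi = self.history & self.mask
        local_taken = self.local[li] >= 2
        global_taken = self.global_[gi] >= 2
        self.accesses += 2         # baseline reads both tables every time
        return global_taken if self.chooser[li] >= 2 else local_taken

    def update(self, pc, taken):
        li = pc & self.mask
        gi = self.history & self.mask
        local_ok = (self.local[li] >= 2) == taken
        global_ok = (self.global_[gi] >= 2) == taken
        if local_ok != global_ok:  # train the chooser only on disagreement
            self.chooser[li] = bump(self.chooser[li], global_ok)
        self.local[li] = bump(self.local[li], taken)
        self.global_[gi] = bump(self.global_[gi], taken)
        self.history = ((self.history << 1) | int(taken)) & self.mask

predictor = TournamentPredictor()
for _ in range(20):                # train on an always-taken branch
    predictor.predict(0x40)
    predictor.update(0x40, True)
final_prediction = predictor.predict(0x40)
```

Any scheme that reads only one of the two tables per prediction would halve the `accesses` count in this model, which is the kind of saving the letter targets.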
Shinji TANAKA Tetsuyasu YAMADA Satoshi SHIRAISHI
The sizes of recent Java-based server-side applications, such as J2EE containers, have been increasing continuously. Past techniques for improving the performance of Java applications have targeted relatively small applications. Moreover, when the methods of these small target applications are invoked, they are not usually distributed over the entire memory space. As a result, these techniques cannot be applied efficiently to improve the performance of current large applications. We propose a dynamic code repositioning approach to improve the hit rates of instruction caches and translation look-aside buffers. Profiles of method invocations are collected while the application runs under its heaviest processor load, and the code is repositioned based on these profiles. We also discuss a method-splitting technique to significantly reduce the sizes of methods. Our evaluation of a prototype implementing these techniques indicated a 5% improvement in the throughput of the application.
Hiroshi YOSHIDA Takehiko TOYODA Ichiro SETO Ryuichi FUJIMOTO Osamu WATANABE Tadashi ARAI Tetsuro ITAKURA Hiroshi TSURUMI
A fully differential direct conversion receiver IC for W-CDMA is presented. The receiver IC consists of an LNA, a quadrature demodulator, low-pass filters (LPFs), and variable gain amplifiers (VGAs). In order to suppress DC offset, which is the most important issue in a direct conversion system, an active harmonic mixer is applied to the quadrature demodulator. Furthermore, the receiving system, including the LNA and an RF filter, adopts a differential architecture to reduce local signal leakage, which generates DC offset. The performance of the entire receiving system was evaluated, and the DC offset in steady state was measured at only 40 mV. Moreover, the DC offset variation at the LNA gain change, which has the largest effect on the receiving performance, was limited to 70 mV, which is less than -10 dB relative to the desired signal strength. Computer simulation confirmed that the DC offset variation at the LNA gain change did not degrade bit error rate (BER) performance at all.
Ye LIU Zheng-Fan LI Mei XUE Rui-Feng XUE
In this paper, an integral equation method is used to compute the capacitance of three-dimensional structures. Since some multi-conductor structures are regularly periodic, a single periodic cell, bounded by appropriate magnetic and electric walls, is used to reduce the computational domain. The periodic Green's function in the integral equation method is represented as an infinite series that converges slowly. In this paper, the Shanks transformation is used to accelerate the convergence. Numerical examples show that the proposed method is accurate and much more efficient in capacitance extraction for 3-D periodic structures.
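The effect of the Shanks transformation on a slowly converging series can be sketched as follows. This is a generic illustration and not the paper's periodic Green's function: the test series (the Leibniz series for pi/4) and all function names are assumptions chosen for brevity.

```python
import math

def partial_sums(terms):
    """Return the running partial sums of a list of series terms."""
    sums, total = [], 0.0
    for term in terms:
        total += term
        sums.append(total)
    return sums

def shanks(seq):
    """Apply one Shanks transformation to a sequence of partial sums."""
    out = []
    for n in range(1, len(seq) - 1):
        denom = seq[n + 1] - 2.0 * seq[n] + seq[n - 1]
        if denom == 0.0:
            out.append(seq[n + 1])  # sequence already converged here
        else:
            out.append((seq[n + 1] * seq[n - 1] - seq[n] ** 2) / denom)
    return out

# Slowly converging test series: 1 - 1/3 + 1/5 - ... = pi/4.
terms = [(-1) ** k / (2 * k + 1) for k in range(12)]
raw = partial_sums(terms)
once = shanks(raw)
twice = shanks(once)

target = math.pi / 4
raw_error = abs(raw[-1] - target)
once_error = abs(once[-1] - target)
twice_error = abs(twice[-1] - target)
print(raw_error, once_error, twice_error)
```

With the same twelve terms, each application of the transformation shrinks the error, which is why it pays off when every series term is expensive to evaluate.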
Mamoru UGAJIN Junichi KODATE Tsuneo TSUKAHARA
This paper describes a 2.4-GHz downconverter that runs on a 1-V supply. The downconverter integrates an LNA, a quadrature mixer, a complex channel-select band-pass filter (BPF), a limiting amplifier, and a frequency doubler using 0.2-µm CMOS/SOI technology. The frequency doubler doubles the frequency deviation of FM signals as well as the frequency itself, which in turn doubles the modulation index. This improves the sensitivity of FM demodulation. The power consumption of the downconverter is 23 mW with a 1-V power supply. A bit-error-rate (BER) measurement using the downconverter and a demodulation IC shows -76.5-dBm sensitivity at a 0.1% BER.
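The claim that frequency doubling also doubles the modulation index can be checked with a short derivation. The abstract does not say how the doubler is implemented, so an ideal squaring operation is assumed here, and the symbols $A$, $f_c$, $f_m$, $\Delta f$, and $\beta$ are introduced only for this sketch.

```latex
% Single-tone FM input with modulation index \beta = \Delta f / f_m
s(t) = A \cos\!\bigl(2\pi f_c t + \beta \sin 2\pi f_m t\bigr)

% Ideal squaring, using \cos^2 x = \tfrac{1}{2}(1 + \cos 2x)
s^2(t) = \frac{A^2}{2}
       + \frac{A^2}{2} \cos\!\bigl(2\pi (2 f_c) t + 2\beta \sin 2\pi f_m t\bigr)

% After removing the DC term: carrier 2 f_c, deviation 2\Delta f,
% and modulation index 2\beta, with f_m unchanged.
```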
A linear-in-dB gain-control amplifier for direct conversion systems employs linearized transconductors in a core amp, a dc offset canceler, and a gain control circuit. The offset compensation circuit achieves a constant corner frequency over a gain range of 14 to 76 dB by simultaneous tuning of the transconductors.
Hiroki HIGA Ikuo NAKAMURA Nozomu HOSHIMIYA
This paper considers head movements as a control command input method for functional electrical stimulation (FES) systems. To detect head movements, we designed a prototype control command input device using acceleration sensors and verified its validity in experiments. The experimental results showed that head movements in lateral flexion and in flexion/extension were detected and separated well by the acceleration sensors.
Xuzhen XIE Takao ONO Shin-ichi NAKANO Tomio HIRATA
A nearly equitable edge-coloring of a multigraph is a coloring such that edges incident to each vertex are colored equitably in number. This problem was solved in O(kn^2) time, where n and k are the numbers of edges and colors, respectively. The running time was later improved to O(n^2/k + n|V|). We present a more efficient algorithm for this problem that runs in O(n^2/k) time.
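The property being computed can be made concrete with a small checker. It assumes the commonly used definition of "nearly equitable", namely that at every vertex the numbers of incident edges of any two colors differ by at most 2; the abstract itself does not state the bound. The function name and the input format are hypothetical, and this is a verifier only, not the paper's coloring algorithm.

```python
from collections import defaultdict

def is_nearly_equitable(edges, coloring, k):
    """Check an edge-coloring of a multigraph.

    edges    -- list of (u, v) pairs; parallel edges are allowed
    coloring -- list of colors in range(k), one per edge
    k        -- number of colors
    Assumed condition: at every vertex, the per-color counts of incident
    edges differ by at most 2.
    """
    counts = defaultdict(lambda: [0] * k)
    for (u, v), color in zip(edges, coloring):
        counts[u][color] += 1
        counts[v][color] += 1
    return all(max(per) - min(per) <= 2 for per in counts.values())

# A small multigraph with one parallel edge between vertices 0 and 1.
edges = [(0, 1), (0, 1), (1, 2), (0, 2)]
balanced = is_nearly_equitable(edges, [0, 1, 0, 1], k=2)

# A star whose three edges all get the same color fails at the center.
star = [(0, 1), (0, 2), (0, 3)]
unbalanced = is_nearly_equitable(star, [0, 0, 0], k=2)
```

The check itself is linear in the number of edges plus k times the number of vertices, so it is cheap compared with constructing the coloring.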
Sangheon PACK Taewan YOU Yanghee CHOI
In a mobile multimedia environment, it is very important to minimize handoff latency due to mobility. Hierarchical Mobile IPv6 (HMIPv6) can be an efficient approach to reducing handoff latency; it uses a mobility agent called the Mobility Anchor Point (MAP) to localize the registration process. However, the MAP can be a single point of failure or a performance bottleneck. To provide mobile users with satisfactory quality of service and fault-tolerant service, it is necessary to cope with the failure of mobility agents. Previously, we proposed Robust Hierarchical Mobile IPv6 (RH-MIPv6), which is an enhanced HMIPv6 for fault-tolerant mobile services. In RH-MIPv6, an MN configures two regional CoAs and registers them with two MAPs during binding update procedures. When a MAP fails, MNs serviced by the faulty MAP (i.e., the primary MAP) can be served by a failure-free MAP (i.e., the secondary MAP) through the failure detection/recovery schemes of RH-MIPv6. In this paper, we present a comparative study of RH-MIPv6 and HMIPv6 under several performance factors such as MAP unavailability, MAP reliability, packet loss rate, and MAP blocking probability. To do this, we use a semi-Markov chain and an M/G/C/C queuing model. Numerical results indicate that RH-MIPv6 outperforms HMIPv6 for all performance factors, especially when the failure rate is high.
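For an M/G/C/C loss system, the blocking probability is given by the Erlang B formula and depends on the service time only through its mean. The sketch below computes it with the standard recursion. How the paper maps MAP capacity and MN arrivals onto C and the offered load is not stated in the abstract, so the parameter names here are assumptions.

```python
def erlang_b(capacity, offered_load):
    """Blocking probability of an M/G/C/C loss system (Erlang B).

    capacity     -- C, the number of simultaneous sessions supported
    offered_load -- arrival rate times mean service time, in Erlangs
    Uses the numerically stable recursion
    B(k) = a * B(k-1) / (k + a * B(k-1)), with B(0) = 1.
    """
    blocking = 1.0
    for k in range(1, capacity + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking

# Hypothetical example: the same load offered to a small and a large MAP.
small_map = erlang_b(5, 5.0)
large_map = erlang_b(10, 5.0)
print(small_map, large_map)
```

The recursion avoids the factorials of the closed-form expression, so it stays accurate for large capacities.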
Intae HWANG Jungyoung SON Sukki HAHN Young-Hwan YOU Daesik HONG Changeon KANG
Rapid time variations of the mobile communication channel have a dramatic impact on the performance of multicarrier modulation. This letter analyzes the effect of the Doppler-induced interchannel interference (ICI) on a space-time block coded (STBC) OFDM-CDMA system in a time-varying Rayleigh fading channel. At the same time, we compute the effect of the ICI on the BER performance of the STBC OFDM-CDMA system using the maximal ratio combining (MRC) and equal gain combining (EGC) schemes.
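The difference between the two combining schemes can be sketched for a single symbol received over several branches. This illustration ignores ICI and the STBC and spreading structure that the letter analyzes; the channel gains, the unit signal power, and the function names are assumptions.

```python
import cmath

def combined_snr(gains, weights, noise_var):
    """Output SNR of a linear combiner with unit-power signal and
    independent noise of variance noise_var on every branch."""
    signal = abs(sum(w * h for w, h in zip(weights, gains))) ** 2
    noise = noise_var * sum(abs(w) ** 2 for w in weights)
    return signal / noise

def mrc_weights(gains):
    """Maximal ratio combining: weight each branch by its conjugate gain."""
    return [h.conjugate() for h in gains]

def egc_weights(gains):
    """Equal gain combining: correct the phase only, with unit magnitude."""
    return [cmath.exp(-1j * cmath.phase(h)) for h in gains]

# Hypothetical complex channel gains for two branches.
gains = [1 + 1j, 0.5j]
mrc = combined_snr(gains, mrc_weights(gains), noise_var=1.0)
egc = combined_snr(gains, egc_weights(gains), noise_var=1.0)
print(mrc, egc)
```

MRC sums the per-branch SNRs, so it never does worse than EGC in this noise-only model; EGC needs only phase estimates, which is its practical appeal.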
Hirokazu KAWABATA Hiroshi TANPO Yoshio KOBAYASHI
A rigorous analysis of a TM010 mode cylindrical cavity with insertion holes is presented on the basis of the Ritz-Galerkin method to realize accurate measurements of the complex permittivity of liquids. The effects of the sample insertion holes, a dielectric tube, and air gaps between the dielectric tube and the sample insertion holes are taken into account in this analysis. The validity of this method is verified by measurements on several kinds of liquid.
Because it has desirable features such as no cascading rollback, fast output commit, and asynchronous logging, causal message logging needs a consistent recovery algorithm to tolerate concurrent failures. For this purpose, Elnozahy proposed a centralized recovery algorithm that has two practical benefits: it reduces the number of stable storage accesses and imposes no restriction on the execution of live processes during recovery. However, when combined with independent checkpointing, the algorithm may force the system into an inconsistent state when processes fail concurrently. In this paper, we identify these inconsistent cases and then present a recovery algorithm that retains both benefits and ensures system consistency when integrated with any kind of checkpointing protocol. In addition, our algorithm requires no additional messages compared with Elnozahy's algorithm.
Hisakazu SATO Yasuhiro NUNOMURA Niichi ITOH Koji NII Kanako YOSHIDA Hironobu ITO Jingo NAKANISHI Hidehiro TAKATA Yasunobu NAKASE Hiroshi MAKINO Akira YAMADA Takahiko ARAKAWA Toru SHIMIZU Yuichi HIRANO Takashi IPPOSHI Shuhei IWADE
A low-power microcontroller has been developed with 0.10 µm bulk-compatible body-tied SOI technology. For this work, only two new masks are required. For the other layers, existing masks of a prior work developed with 0.18 µm bulk CMOS technology can be applied without any changes. With the SOI technology, high-speed operation of over 600 MHz has been achieved at a supply voltage of 1.2 V, which is 1.5 times faster than the prior work. Also, a five-fold improvement in the power-delay product has been achieved at a supply voltage of 0.8 V. Moreover, the compatibility of the SOI technology with bulk CMOS has been verified, because all circuit blocks of the chip, including logic, memory, analog circuits, and the PLL, are completely functional, even though only two new masks are used.
Eiji KONAKA Tatsuya SUZUKI Shigeru OKUMA
The PLC (Programmable Logic Controller) has been widely used in the industrial world as a controller for manufacturing systems, as a process controller and so on. The conventional PLC has been designed and verified as a pure Discrete Event System (DES) by using an abstract model of a controlled plant. In verifying the PLC, however, it is also important to take into account the physical behavior (e.g. dynamics, shape of objects) of the controlled plant in order to guarantee such important factors as safety. This paper presents a new verification technique for the PLC-based control system, which takes into account these physical behaviors, based on a Hybrid Dynamical System (HDS) framework. The other key idea described in the paper is the introduction of the concept of signed distance which not only measures the distance between two objects but also checks whether two objects interfere with each other. The developed idea is applied to illustrative material handling problems, and its usefulness is demonstrated.
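The idea of a signed distance that both measures separation and flags interference can be sketched for the simplest case of two circular objects. The paper's own definition and object shapes are not given in the abstract, so the circle model and the function names below are assumptions.

```python
import math

def signed_distance_circles(center_a, radius_a, center_b, radius_b):
    """Signed distance between two circles in the plane.

    Positive: the gap between the boundaries.
    Zero:     the circles touch.
    Negative: the circles overlap, by the returned depth.
    """
    dx = center_a[0] - center_b[0]
    dy = center_a[1] - center_b[1]
    return math.hypot(dx, dy) - (radius_a + radius_b)

def interferes(center_a, radius_a, center_b, radius_b):
    """Two objects interfere when their signed distance is negative."""
    return signed_distance_circles(center_a, radius_a, center_b, radius_b) < 0.0

# Hypothetical example: a workpiece and a gripper modeled as unit circles.
apart = signed_distance_circles((0.0, 0.0), 1.0, (3.0, 0.0), 1.0)
overlapping = signed_distance_circles((0.0, 0.0), 1.0, (1.0, 0.0), 1.0)
```

Because a single scalar covers both cases, a verification condition such as "the objects never interfere" reduces to checking that the signed distance stays non-negative along every trajectory.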
Hiroyuki TOMIYAMA Hiroaki TAKADA Nikil D. DUTT
Energy consumption has become one of the most critical constraints in the design of portable multimedia systems. For media applications, address buses between the processor and data memory consume a considerable amount of energy due to their large capacitance and frequent accesses. This paper studies the impact of memory data organization on address bus energy. Our experiments show that address bus activity is reduced by 50% by exploring memory data organization and encoding address buses.
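Address bus activity is usually measured as the number of bit transitions between consecutive addresses. The sketch below compares plain binary addressing with Gray coding for a sequential access pattern. Gray coding is one well-known bus encoding; the abstract does not say which encoding or data organization the paper uses, so this choice and the 256-address trace are assumptions.

```python
def bit_transitions(words):
    """Total number of bus lines that toggle across a sequence of words."""
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))

def to_gray(value):
    """Convert a non-negative integer to its reflected Gray code."""
    return value ^ (value >> 1)

# Hypothetical trace: a sequential sweep over 256 consecutive addresses.
addresses = list(range(256))
binary_toggles = bit_transitions(addresses)
gray_toggles = bit_transitions([to_gray(a) for a in addresses])
print(binary_toggles, gray_toggles)
```

Gray coding helps only when accesses are mostly sequential, which is why the data layout in memory and the bus encoding are worth exploring together.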
Je-Hoon LEE YoungHwan KIM Kyoung-Rok CHO
In this paper, we design and implement a fast asynchronous embedded CISC microprocessor, A8051, introducing a well-tuned pipeline architecture and enhanced control schemes. This work shows an asynchronous design methodology for a CISC-type processor, handling the complicated control structure and various instructions. We tuned the proposed architecture to a 5-stage pipeline, reducing the number of idle stages. For this work, we regrouped the instructions based on the number of machine cycles identified. A8051 has three enhanced control features to improve system performance: multi-looping control of the pipeline stage, a variable-length instruction register that fetches a multiple-word instruction at a time, and branch prediction acceleration. The proposed A8051 was synthesized to a gate-level design using a 0.35 µm CMOS standard cell library. Simulation results indicate that A8051 provides about 18 times higher speed than the traditional Intel 8051 and about 5 times higher speed than the previously designed asynchronous 8051. In terms of power consumption, the A8051 core shows 15 times higher MIPS/Watt than the synchronous H8051.
This paper presents a computationally efficient subspace-based method for partially adaptive beamforming which is based on the structure of the generalized sidelobe canceller (GSC). Its auxiliary beamformer operates in an estimated interference subspace which is obtained through simple computation. The computational burden of the proposed method in terms of complex multiplications is only O(η^2 M), where η and M are the numbers of interferences and array elements, respectively. Though the subspace obtained differs from the exact interference subspace due to the presence of noise, theoretical analysis shows that the proposed beamformer virtually attains the optimal performance for strong or sidelobe interference. Simulation results validate its effectiveness, including fast convergence, even in the presence of errors in the detected number of directional signals.
The capabilities of reliable computations in one-dimensional cellular automata are investigated by means of the Early Bird Problem. The problem is typical for situations in massively parallel systems where a global behavior must be achieved by only local interactions between the single elements. The cells that cause the misoperations are assumed to behave as follows. They run a self-diagnosis once before the actual computation. The result is stored locally such that the working state of a cell becomes visible to its neighbors. A non-working (defective) cell cannot modify information but is able to transmit it unchanged with unit speed. We present an O(n log (n) log (n))-time fault-tolerant solution of the Early Bird Problem.
In this paper, we study the classical firing squad synchronization problem on a model of fault-tolerant cellular automata that may contain defective cells. Several fault-tolerant, time-efficient synchronization algorithms are developed based on a simple freezing-thawing technique. It is shown that, under some constraints on the distribution of defective cells, any cellular array of length n with p defective cell segments can be synchronized in 2n - 2 + p steps.
This paper presents a transmit diversity scheme that allocates space-time block codes (STBC) to beamspace and spreading codes for two-dimensional spreading orthogonal frequency-division multiplexing code-division multiplexing (OFDM-CDM) downlink transmission. In this scheme, the STBC output symbols are beam-steered using a pair of neighboring beams selected via closed-loop beam selection. The beam-steered symbols in two adjacent time slots are spread by two distinct spreading codes and multiplexed in the same spreading segment. User signals that are transmitted from different pairs of beams but share one beam interfere with each other when decoding STBC. Spreading codes are thus allocated to users according to the beam pairs used. This suppresses the interference in the time-direction despreading that precedes decoding of STBC. Simulation results demonstrated that the proposed scheme provides beam gains or beam diversity gains or both and that it alleviates inter-code interference by spatially separating user signals using transmit beams. The proposed scheme also provides high tolerance to large Doppler spread.