Takafumi AOKI Naofumi HOMMA Tatsuo HIGUCHI
This paper presents a new approach to designing arithmetic circuits by using a graph-based evolutionary optimization technique called Evolutionary Graph Generation (EGG). The key idea of the proposed method is to introduce a higher level of abstraction for arithmetic algorithms, in which arithmetic circuit structures are modeled as data-flow graphs associated with specific number representation systems. The EGG system employs evolutionary operations to transform the structure of graphs directly, which makes it possible to generate the desired circuit structure efficiently. The potential capability of EGG is demonstrated through an experiment of generating constant-coefficient multipliers.
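As a rough illustration of what a constant-coefficient multiplier looks like when tied to a specific number representation, the sketch below recodes the constant into signed digits (-1, 0, +1) and multiplies with shifts, additions, and subtractions only. It is not the EGG system and involves no evolutionary search; the function names `signed_digits` and `multiply_by_constant` are hypothetical and chosen only for this example.

```python
def signed_digits(c: int) -> list[int]:
    """Recode integer c into signed digits (-1, 0, +1), least significant first.

    The result is the non-adjacent form: no two neighbouring digits are nonzero,
    which keeps the number of add/subtract nodes small.
    """
    digits = []
    while c != 0:
        if c & 1:
            d = 2 - (c & 3)  # +1 when c mod 4 == 1, -1 when c mod 4 == 3
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits


def multiply_by_constant(x: int, c: int) -> int:
    """Compute c * x using only shifts, additions, and subtractions."""
    acc = 0
    for shift, d in enumerate(signed_digits(c)):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc
```

For example, the constant 7 is recoded as 8 - 1, so `multiply_by_constant(x, 7)` needs one subtraction instead of the two additions required by the plain binary form 4 + 2 + 1.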
Isao TAKENAKA Hidemasa TAKAHASHI Kazunori ASANO Kohji ISHIKURA Junko MORIKAWA Hiroaki TSUTSUI Masaaki KUZUHARA
This paper describes a high-power and low-distortion AlGaAs/GaAs HFET amplifier developed for digital cellular base station systems. We proved experimentally that distortion characteristics such as IMD (Intermodulation Distortion) or NPR (Noise Power Ratio) are drastically degraded when the absolute value of the drain bias circuit impedance at low frequency is high. Based on the experimental results, we have designed the drain bias circuit so that it does not influence the distortion characteristics. The developed amplifier employed two pairs of pre-matched GaAs chips mounted on a single package, and the total output power was combined in a push-pull configuration with a microstrip balun circuit. The push-pull amplifier demonstrated state-of-the-art performance of 140 W output power with 11.5 dB linear gain at 2.2 GHz. In addition, it exhibited extremely low distortion performance of less than 30 dBc at a two-tone total output power of 46 dBm. These results indicate that the design of the drain bias circuit is of great importance for achieving improved IMD characteristics while maintaining high power performance.
Manabu FUKUSHIMA Takatoshi OKUNO Hirofumi YANAGAWA Ken'iti KIDO
This paper proposes a method of improving the accuracy of the attenuation constant estimate obtained by using the cross-spectral technique. In the cross-spectral technique, the envelope of the estimated impulse response is deformed due to the use of a time window. As a result, the estimated impulse response decays more rapidly than the real impulse response does, and the attenuation constant obtained by the estimated impulse response becomes larger than the real value. This paper first describes how the attenuation constant changes in the process of impulse response estimation. Next, we propose a method of improving the accuracy of the estimation. The effect of the proposed method is confirmed by computer simulation.
Kozo TAGUCHI Kaname FUKUSHIMA Atsuyuki ISHITANI Masahiro IKEDA
We first demonstrate a self-pulsation phenomenon in a semiconductor ring laser (SRL). Not only self-mode-locked optical pulses but also self-Q-switched optical pulses can be observed in an SRL. Furthermore, experimental results show that the repetition period of the Q-switched optical pulse train can be controlled by the injection current to the SRL.
Hiroaki KIKUCHI Michael HAKAVY Doug TYGAR
Auctions are a critical element of the electronic commerce infrastructure. But for real-time applications, auctions are a potential problem - they can cause significant time delays. Thus, for most real-time applications, sealed-bid auctions are recommended. But how do we handle tie-breaking in sealed-bid auctions? This paper analyzes the use of multi-round auctions where the winners from an auction round participate in a subsequent tie-breaking second auction round. We perform this analysis over the classical first-price sealed-bid auction that has been modified to provide full anonymity. We analyze the expected number of rounds and optimal values to minimize communication costs.
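The question of how many rounds a tie-breaking auction needs can be explored with a small Monte Carlo sketch. The model below is an assumption made only for illustration and is not the paper's analysis: every remaining bidder picks one of `n_prices` discrete prices uniformly at random, and all bidders tied at the highest price move on to the next round. The function names are hypothetical.

```python
import random


def rounds_until_single_winner(n_bidders, n_prices, rng, max_rounds=10_000):
    """Count auction rounds until exactly one bidder holds the highest bid.

    Assumed model: bids are independent and uniform over n_prices price levels.
    """
    if n_bidders < 1 or n_prices < 2:
        raise ValueError("need at least one bidder and two price levels")
    remaining = n_bidders
    for round_no in range(1, max_rounds + 1):
        bids = [rng.randrange(n_prices) for _ in range(remaining)]
        remaining = bids.count(max(bids))  # bidders tied at the top re-bid
        if remaining == 1:
            return round_no
    raise RuntimeError("no single winner within max_rounds")


def average_rounds(n_bidders, n_prices, trials=2000, seed=0):
    """Estimate the expected number of rounds by simulation."""
    rng = random.Random(seed)
    total = sum(
        rounds_until_single_winner(n_bidders, n_prices, rng) for _ in range(trials)
    )
    return total / trials
```

Under this model, fewer price levels mean shorter bids per round but more ties and therefore more rounds, which is the kind of communication-cost trade-off the abstract refers to.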
Naokazu YOKOYA Takeshi SHAKUNAGA Masayuki KANBARA
Acquisition of three-dimensional information of a real-world scene from two-dimensional images has been one of the most important issues in computer vision and image understanding in the last two decades. Noncontact range acquisition techniques can be essentially classified into two classes: passive and active. This paper concentrates on passive depth extraction techniques, which have the advantage that 3-D information can be obtained without affecting the scene. Passive range sensing techniques are often referred to as shape-from-x, where x is one of the visual cues such as shading, texture, contour, focus, stereo, and motion. These techniques produce 2.5-D representations of visible surfaces. This survey discusses aspects of this research field and reviews some recent advances including video-rate range imaging sensors as well as emerging themes and applications.
A computational sensor (in other words, a smart sensor or vision chip) is a very small integrated system in which processing and sensing are unified on a single VLSI chip. It is designed for a specific targeted application. Research activities on computational sensors are described in this paper. There have been quite a few proposals and implementations of computational sensors. First, their approaches are summarized from several points of view: advantages vs. disadvantages, neural vs. functional, architecture, analog vs. digital, local vs. global processing, imaging vs. processing, and new processing paradigms. Then, several examples are introduced, covering spatial processing, temporal processing, A/D conversion, and programmable computational sensors. Finally, the paper is concluded.
Muneharu YOKOYAMA Takaomi SHIGEHARA Hiroshi MIZOGUCHI Taketoshi MISHIMA
The Conjugate Residual method, one of the iterative methods for solving linear systems, is applied to problems with a dense coefficient matrix on distributed-memory parallel computers. Based on an assumption on the computation and communication times of the proposed algorithm for parallel computers, it is shown that the optimal number of processing elements is proportional to the problem size N. The validity of the prediction is confirmed through numerical experiments on the Hitachi SR2201.
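For readers unfamiliar with the Conjugate Residual iteration, a minimal serial sketch follows. It assumes a symmetric coefficient matrix (the abstract does not state this) stored as a dense list of rows, and it does not model the distributed-memory parallelization or the timing analysis. The dense matrix-vector product in `matvec` is the step whose cost grows as N squared and would be the part distributed across processing elements.

```python
from math import sqrt


def matvec(A, v):
    """Dense matrix-vector product; the dominant O(N^2) step per iteration."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_residual(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric A, starting from x = 0."""
    x = [0.0] * len(b)
    r = [float(bi) for bi in b]  # residual b - A x with x = 0
    p = list(r)
    Ar = matvec(A, r)
    Ap = list(Ar)
    rAr = dot(r, Ar)
    for _ in range(max_iter):
        if sqrt(dot(r, r)) < tol:
            break
        ApAp = dot(Ap, Ap)
        if ApAp == 0.0 or rAr == 0.0:
            break  # breakdown; can occur for indefinite matrices
        alpha = rAr / ApAp
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        Ar = matvec(A, r)
        rAr_new = dot(r, Ar)
        beta = rAr_new / rAr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        Ap = [ari + beta * api for ari, api in zip(Ar, Ap)]
        rAr = rAr_new
    return x
```

Calling `conjugate_residual([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])` returns a vector close to (1/11, 7/11), the exact solution of that small system.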
Katsuhiko SAKAUE Akira AMANO Naokazu YOKOYA
In this paper, the authors present general views of computer vision and image processing based on optimization. Relaxation and regularization in both broad and narrow senses are used in various fields and problems of computer vision and image processing, and they are currently being combined with general-purpose optimization algorithms. The principles and case examples of relaxation and regularization are discussed; the application of optimization to shape description, which is a particularly important problem in the field, is described; and the use of a genetic algorithm (GA) as a method of optimization is introduced.
The goal of this paper is to present a critical survey of the existing literature on omnidirectional sensing. The range of vision applications such as autonomous robot navigation, telepresence, and virtual reality is expanding through the use of cameras with a wide angle of view. In particular, a real-time omnidirectional camera with a single center of projection is suitable for analysis and monitoring, because any desired image projected on any designated image plane, such as a pure perspective image or a panoramic image, can easily be generated from the omnidirectional input image. In this paper, I review the designs and principles of existing omnidirectional cameras, which can acquire an omnidirectional (360 degrees) field of view, and their applications in the fields of autonomous robot navigation, telepresence, remote surveillance, and virtual reality.
Sang-Joon NAM In-Cheol PARK Chong-Min KYUNG
This paper presents a new approach to the precise interrupt handling problem in modern processors with multiple out-of-order issue. It is difficult to implement a precise interrupt scheme in such processors because later instructions may change the process states before their preceding instructions have completed. We propose a fast precise interrupt handling scheme which can recover the precise state in one cycle if an interrupt occurs. In addition, the scheme removes all the associative searching operations which are inevitable in the previous approaches. To deal with the renaming of destination registers, we present a new bank-based register file which is indexed by bank index tables containing the bank identifiers of renamed register entries. Simulation results based on the superscalar MIPS architecture show that the register file with 3 banks is a good trade-off between high performance and low complexity.
Takashi MIYAMORI Kunle OLUKOTUN
This paper describes a new reconfigurable processor architecture called REMARC (Reconfigurable Multimedia Array Coprocessor). REMARC is a small array processor that is tightly coupled to a main RISC processor. It consists of a global control unit and 64 16-bit processors called nano processors. REMARC is designed to accelerate multimedia applications, such as video compression, decompression, and image processing. These applications typically use 8-bit or 16-bit data; therefore, each nano processor has a 16-bit datapath that is much wider than those of other reconfigurable coprocessors. We have developed a programming environment for REMARC and several realistic application programs, including DES encryption, MPEG-2 decoding, and MPEG-2 encoding. REMARC can implement various parallel algorithms which appear in these multimedia applications. For instance, REMARC can implement SIMD-type instructions, similar to multimedia instruction extensions, for motion compensation in MPEG-2 decoding. Furthermore, highly pipelined algorithms, such as systolic algorithms, which appear in motion estimation in MPEG-2 encoding, can also be implemented efficiently. REMARC achieves speedups ranging from a factor of 2.3 to 21.2 over the base processor, which is a single-issue processor or a 2-issue superscalar processor. We also compare its performance with multimedia instruction extensions. Using more processing resources, REMARC can achieve higher performance than multimedia instruction extensions.
Takashi MORIE Jun FUNAKOSHI Makoto NAGATA Atsushi IWATA
This paper presents a neural circuit using a PWM technique based on an analog-digital merged circuit architecture. Some new PWM circuit techniques are proposed. A bipolar-weighted summation circuit is described which attains 8-bit precision in SPICE simulation at a 5 V supply voltage by compensating for parasitic capacitance effects. A high-performance differential-type latch comparator which can discriminate a 1 mV difference at 100 MHz in SPICE simulation is also described. Next, we present a prototype chip fabricated using a 0.6 µm CMOS process. The measurement results demonstrate that the overall precision in the weighted summation and the sigmoidal transformation is 5 bits. A neural network has been constructed using the prototype chips, and the experimental results for realizing the XOR function have successfully verified the basic neural operation.
This paper describes low-power architecture methodologies for programmable multimedia processors, which will become major functional units in System-On-a-Chip designs. After a brief review of multimedia processing and low-power considerations, recent programmable chips, including MPUs and DSPs, are investigated in terms of low-power implementation. In order to show the difference in low-power approaches between programmable processors and ASIC processors, a single-chip MPEG-2 encoder is also included as an example of ASIC design.
Hidetoshi TANAKA Shigeo SATO Koji NAKAJIMA
Chaotic noise is one of the most important tools for information processing such as neural networks. It has been suggested that chaotic neural networks have high performance in information processing. In this paper, we report two designs of a compact chaotic noise generator for large-scale integrated circuits using CMOS technology. The chaotic noise is generated using map chaos. We design both logistic map type and tent map type circuits. These chaotic noise generators are compact compared with other circuits. The results show successful chaotic operation of the circuits, as indicated by the positive Lyapunov exponent. We calculate the Lyapunov exponents to certify the results of the chaotic operations. However, it is hard to estimate an accurate value for noisy data using the conventional method. Hence, we propose a modified calculation of the Lyapunov exponent for noisy data. These two circuits are expected to be utilized for various applications.
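The link between map chaos and a positive Lyapunov exponent can be checked numerically on the ideal maps. The sketch below estimates the exponent as the orbit average of ln|f'(x)|, which is the conventional calculation for a noise-free map; it does not reproduce the modified calculation for noisy data proposed in the paper. The parameter values (3.9 for the logistic map, 1.99 for the tent map) and all function names are assumptions for illustration and are not taken from the circuits.

```python
from math import log


def logistic(x, r=3.9):
    return r * x * (1.0 - x)


def logistic_slope(x, r=3.9):
    return r * (1.0 - 2.0 * x)


def tent(x, mu=1.99):
    return mu * min(x, 1.0 - x)


def tent_slope(x, mu=1.99):
    return mu if x < 0.5 else -mu


def lyapunov_exponent(f, slope, x0=0.1234, n=100_000, burn_in=1_000):
    """Average ln|f'(x)| along an orbit of the one-dimensional map f."""
    x = x0
    for _ in range(burn_in):  # discard the initial transient
        x = f(x)
    total, count = 0.0, 0
    for _ in range(n):
        s = abs(slope(x))
        if s > 0.0:  # skip the measure-zero point where the slope vanishes
            total += log(s)
            count += 1
        x = f(x)
    return total / count
```

A positive return value indicates exponential divergence of nearby orbits. For the tent map the slope magnitude is constant, so the estimate equals ln(mu) up to rounding.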
BUDIARTO Kaname HARUMOTO Masahiko TSUKAMOTO Shojiro NISHIO
Recently, mobile computing has received much attention from the database community. Sharing information among mobile users is one of the most challenging issues in mobile computing due to user mobility. Replication is a promising technique for addressing this issue. However, adopting replication in mobile computing is a non-trivial task, since we still face other problems such as the limited disk capacity and wireless network bandwidth available to mobile users. We have proposed a dynamic replica allocation strategy called User Majority Replica Allocation (UMRA) that is well suited to the modern architecture of mobile computing environments while avoiding the problems mentioned above. In this paper, we propose two relocation decision policies for UMRA and provide a cost analysis for them. We also provide a cost analysis for another replica allocation strategy called Static Replica Allocation (SRA) for comparison.
Masanori HASHIMOTO Hidetoshi ONODERA Keikichi TAMARU
We present a method for power and delay optimization by input reordering. We observe that the reordering has a significant effect on the power dissipation of the gate which drives the reordered gate. This is because the input capacitance depends on the signal values of other inputs. This property, however, has not been utilized for power reduction. Previous approaches focus on the reduction of the power dissipated by internal capacitances of the reordered gate. We propose a heuristic algorithm that considers the total power consumed in the driving gate and the reordered gate. Experimental results using 30 benchmark circuits show that our method reduces the power dissipation in all the circuits by 5.9% on average. Power dissipation can potentially be reduced by up to 22.5%. In the case of delay and power optimization, our method reduces delay by 6.7% and power dissipation by 5.3% on average.
Toshio HASEGAWA Junko NAKAJIMA Mitsuru MATSUI
Recently, the study and implementation of elliptic curve cryptosystems (ECC) have developed rapidly and attracted considerable attention. ECC has the advantage of high-speed processing in software even in restricted environments such as smart cards. In this paper, we concentrate on a complete software implementation of ECC over a prime field on a 16-bit microcomputer, the M16C (10 MHz). We propose a new type of prime characteristic of the base field suitable for small and fast implementation, and also improve the basic elliptic arithmetic formulas. We report a small and fast software implementation of a cryptographic library which supports 160-bit elliptic curve DSA (ECDSA) signature generation, verification, and SHA-1 on the processor. This library also includes general integer arithmetic routines for applicability to other cryptographic algorithms. We successfully implemented the library in a 4 Kbyte code/data size including SHA-1, and confirmed a speed of 150 msec for generating an ECDSA signature and 630 msec for verifying an ECDSA signature on the M16C.
Norio MATSUFURU Kouji NISHIMURA Reiji AIBARA
We study resource allocation strategies in ATM switches, which provide quality of service (QoS) guarantees to individual connections. In order to minimize the cell loss rate over a wide range of traffic characteristics, an efficient allocation strategy is necessary. In this paper we introduce a resource allocation strategy, named TP+WRR (Threshold Pushout + Weighted Round Robin) which can fully utilize the buffer space and the bandwidth. We compare the performance of TP+WRR with two typical resource allocation strategies. An exact queueing analysis based on a Markov model is carried out under bursty traffic sources to evaluate their performance. Our results reveal that TP+WRR considerably improves the cell loss probability over the other strategies considered in this paper, especially when many connections are sharing a link.
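The bandwidth-sharing half of such a strategy can be pictured with a plain weighted round robin scheduler. The sketch below is a generic textbook form and is only an assumption about the WRR component: it does not model the threshold pushout buffer policy, the bursty traffic sources, or the Markov analysis. The function name and the per-connection weights are hypothetical.

```python
from collections import deque


def weighted_round_robin(queues, weights, n_slots):
    """Serve up to n_slots cells, visiting each queue in turn.

    In every cycle, queue i may send at most weights[i] cells, so backlogged
    connections share the link in proportion to their weights.
    """
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive integers")
    served = []
    while len(served) < n_slots and any(queues):
        for q, w in zip(queues, weights):
            for _ in range(w):
                if not q or len(served) >= n_slots:
                    break
                served.append(q.popleft())
    return served
```

With two backlogged connections holding cells `a1, a2, a3` and `b1, b2, b3` and weights 2 and 1, the service order is `a1, a2, b1, a3, b2, b3`: once the first queue empties, its unused share passes to the second queue.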
Kouji WADA Yasuo IWAMOTO Ikuo AWAI
Basic characteristics of a short-ended coplanar waveguide (CPW) resonator with good spurious suppression properties are studied. The resonator is loaded with open stubs at its middle position and forms a fully planar structure. The length of the resonator is shortened almost by half, and the first spurious resonance goes up to more than 3 times the fundamental resonant frequency without degradation of the unloaded Q (Q0). The origin and properties of the spurious response are thoroughly investigated to show the advantages and the limits of this configuration. The external Q (Qe) and fundamental resonant frequency of the resonator are also clarified theoretically and experimentally. Using those results, a bandpass filter (BPF) designed on the basis of the narrow-band approximation is realized, and its transmission characteristics are examined theoretically and experimentally. The spurious suppression characteristics have been realized by the present filter in accordance with expectations.