Yoshiharu TOSAKA Kunihiro SUZUKI Shigeo SATOH Toshihiro SUGII
The effects of α-particle-induced parasitic bipolar current on soft errors in submicron 6-transistor SOI SRAMs were numericaly studied. It was shown that the bipolar current induces soft errors and that there exists a critical quantity which determines the soft error occurrence in the SOI SRAMs. Simulated soft error rates were in the same order as those for bulk SRAMs.
Han-Kyu LEE Jae-Yeal NAM Jin-Soo CHOI Yeong-Ho HA
Full-search block-matching motion estimation is a popular method to reduce temporal redundancies in video sequence. Due to its excessive computational load, parallel processing architectures are often required for real-time processing. One of the architectures is Hsieh's architecture based on systolic array processor and shift register arrays. Serial input characteristic of his scheme can reduce the number of pixel inputs to one, at the expense of significantly increasing the initialization time. This paper presents a modified and generalized Hsieh's architecture to reduce the initialization time. The proposed architecture can easily control data flows by rearranging shift register arrays and input-pin counts by using multiplexers on input stage, while retaining good properties of Hsieh's. The proposed architecture has the following advantages: (1) it allows controllable data inputs to save the pin counts, (2) it is flexible to the dimensional change of the search area via simple control, (3) it can operate in real time for video conference applications, and (4) it has simple and modular structure which is quite suitable for VLSI implementation. For verification of the proposed architecture, VHDL simulations are performed and some results are given.
Hiroshi KUMAGAI Kenji NAKAMURA Hiroshi HANADO Ken'ichi OKAMOTO Naoki HOSAKA Noriaki MIYANO Toshiaki KOZU Nobuhiro TAKAHASHI Toshio IGUCHI Hiroshi MIYAUCHI
A new airborne rain radar named CAMPR (CRL Airborne Multiparameter precipitation Radar) has been developed for the major purpose of calibrating PR (Precipitation Radar) onboard TRMM (Tropical Rainfall Measuring Mission; scheduled to be launched in 1997) in orbit by observing the same rain with both CAMPR and TRMM satellite. CAMPR operates as a coherent radar at 13.8 GHz, the same frequency as TRMM-PR, and has polarimetric and Doppler capabilities. It is installed on a relatively small aircraft and can scan the antenna over a wide angle range, from the nadir to the near-horizon. These functions have been verified to work well and it is shown that the radar system is accurately calibrated. Examples of measurement data show CAMPR's high capability to extract various quantities relating to precipitation and cloud physics. Before the TRMM launch, CAMPR is being used to obtain TRMM-PR simulation data to help its algorithm development as well as to obtain data concerning precipitation and cloud physics.
Moritoshi YASUNAGA Tatsuo OCHIAI
Neural network hardware using time-shared bus and integer representation architecture has already been fabricated and reported from the design viewpoint. However, nothing related to performance evaluation of hardware has yet been presented. Computation-speed, scalability and learning accuracy of hardware are evaluated theoretically and experimentally using a Back Propagation (BP) algorithm. In addition, a mirror-weight assignment technique is proposed for high-speed computation in the BP. NETTalk, an English-pronunciation-reasoning task, has been chosen as the target application for the BP. In the experiment, recently-developed neuro-hardware based on the above architecture and its parallel programming language are used. An outline of the language is described along with BP programming. Mirror-weight assignment allows maximum speed at 55.0 MCUPS (Million Connections Updated Per Second) using 256 neurons in the hidden-layer (numbers of neurons in input-and output-layers are fixed at 203 and 26 respectively in NETTalk). In addition, if scalability is defined as a function of the number of neurons in the hidden-layer, the machine retains high scalability at 0.5 if such a maximum speed needs to be used. No degradation in learning accuracy occurs when experimental results computed using the neuro-hardware are compared with those obtained by floating-point representation architecture (workstation). The experiment indicates that the present integer representational design of the neuro-hardware is sufficient for NETTalk. Performance has been evaluated theoretically. For evaluation purposes, it is assumed that most of the total execution-time is taken up by bus cycles. On the basis of this assumption, an analytical model of computation-speed and scalability is proposed. Analytical predictions agreed well with experimental results.
Md. Kamrui HASAN Takashi YAHAGI
We present a new method for the identification of time-invariant multichannel autoregressive (AR) processes corrupted by additive white observation noise. The method is based on the Yule-Walker equations and identifies the autoregressive parameters from a finite set of measured data. The input signals to the underlying process are assumed to be unknown. An inverse filtering technique is used to estimate the AR parameters and the observation noise variance, simultaneously. The procedure is iterative. Computer simulation results that demonstrate the performance of the identification method are presented.
Hitoshi MIYATA Makoto OHKI Masaaki OHKITA
For a fuzzy control of manipulated variable so as to match a required output of a plant, tuning of fuzzy rules are necessary. For its purpose, various methods to tune their rules automatically have been proposed. In these method, some of them necessitate much time for its tuning, and the others are lacking in the generalization capability. In the fuzzy control by the steepest descent method, a use of piecewise linear membership functions (MSFs) has been proposed. In this algorithm, MSFs of the premise for each fuzzy rule are tuned having no relation to the other rules. Besides, only the MSFs corresponding to the given input and output data for the learning can be tuned efficiently. Comparing with the conventional triangular form and the Gaussian distribution of MSFs, an expansion of the expressiveness is indicated. As a result, for constructing the inference rules, the training cycles can be reduced in number and the generalization capability to express the behavior of a plant is expansible. An effectiveness of this algorithm is illustrated with an example of a parallel parking of an autonomous mobile robot.
Yoshiaki TOKUNAGA Akiyuki MINAMIDE
We proposed a new thchnique using saw wave modulation light to measure the thermal diffusivity of a transparent adhesive by photoacoustic microscope. In this technique, the time required for the measurement of it can be reduced by one-fifth compared with that of a conventional method.
Kensuke MIYASHITA Yoshihiro TSUJINO Nobuki TOKURA
Processor networks connected by buses have attracted considerable attention. Since a reconfigurable array is more powerful than a PRAM and more practical, it becomes the focus of attention. The Processor Array with Reconfigurable Bus System (PARBS) and the Reconfigurable Multiple Bus Machine (RMBM) are both models of parallel computation based on reconfigurable bus and processor array. The PARBS is a processor array that consists of processors arranged to a 2-dimensional grid with a reconfigurable bus system. The RMBM is also made of processors and reconfigurable bus system, but the processors are located in a row and the number of processors and the number of buses are independent of each other. Four versions of RMBM has been proposed and Extended RMBM (E-RMBM) is regarded as the most powerful one among them. In this paper, we describe that a PARBS of size N M can be simulated in constant time by a E-RMBM of 4NM processors, M + 3 buses and 1 read buffer, and that a E-RMBM of P processors, B buses and D read buffers can be also simulated in constant time by a PARBS of size B P. A PARBS or RMBM that solves a computational problem of size n is polynomially bounded iff the product of the number of processors and buses and red and write ports is O (nc), for some constant c. When a PARBS is polynomially bounded, the E-RMBM which simulates it is also polynomially bounded, and vice versa.
Hiroshi KONDO Shuji TUTUMI Satoshi MIKURIYA
A simple and convenient approach for a radial symmetrical point detection is proposed. In this paper the real part-only synthesis is utilized in order to make an origin symmetric pattern of the original image and to perform automatically the calculation of its autocorrelation for the detection of the symmetry center of the image.
This paper presents a parallel move generation of a Chess machine system for achieving the purpose of reducing the number of move generation cycles. The parallel system is composed of five move generation modules which share the move generating cycles to reduce the time of building a game tree. Simulation results show that the proposed parallel move generation architecture takes about half of the number of move generation cycles to build a game tree that is the same as the one built by a sequential move generation module.
Masaki KUMANOYA Toshiyuki OGAWA Yasuhiro KONISHI Katsumi DOSAKA Kazuhiro SHIMOTORI
Various kinds of new architectures have been proposed to enhance operating performance of the DRAM. This paper reviews these architectures including EDO, SDRAM, RDRAM, EDRAM, and CDRAM. The EDO slightly modifies the output control of the conventional DRAM architecture. Other innovative architectures try to enhance the performance by taking advantage of DRAM's internal multiple bits architecture with internal pipeline, parallel-serial conversion, or static buffers/on-chip cache. A quantitative analysis based on an assumption of wait cycles was made to compare PC system performance with some architectures. The calculation indicated the effectiveness of external or on-chip cache. Future trends cover high-speed I/O interface, unified memory architecture, and system integrated memory. The interface includes limited I/O swing such as HSTL and SSTL to realize more than 100MHz operation. Also, Ramlink and SyncLink are briefly reviewed as candidates for next generation interface. Unified memory architecture attempts to save total memory capacity by combining graphics and main memory. Advanced device technology enables system integration which combine system logic and memory. It suggests one potential direction towards system on a chip in the future.
Kiyotaka ATSUMI Shigeru MASUYAMA
We propose a parallel parsing algorithm based on Earley's method, which works in O(log2n) time using O(n4.752) processors on CREW PRAM. This algorithm runs with less number of precessors compared with previously proposed W. Rytter's algorithm.
We propose a new algorithm for minimizing the number of vertices of an approximate curve by keeping the error within a given bound (min-# problem) with the parallel-strip error criterion. The best existing algorithm which solves this problem has O (n2 log n) time complexity. Our algorithm which uses the Cone Intersection Method does not have an improved time complexity, but does have a high efficiency. In particular, for practical data such as those which represent the boundaries or the skeletons of an object, the new algorithm can solve the min-# problem in nearly O(n2) time.
We define two restricted classes of Boolean circuits by assuming the following conditions on underlying graphs of circuits, and prove, for each class, nonlinear lower bounds on size of circuits computing cyclic shifts:
Hiroyoshi YAMADA Yoshio YAMAGUCHI Masakazu SENGOKU
A new superresolution technique is proposed for high-resolution estimation of the scattering analysis. For complicated multipath propagation environment, it is not enough to estimate only the delay-times of the signals. Some other information should be required to identify the signal path. The proposed method can estimate the frequency characteristic of each signal in addition to its delay-time. One method called modified (Root) MUSIC algorithm is known as a technique that can treat both of the parameters (frequency characteristic and delay-time). However, the method is based on some approximations in the signal decorrelation, that sometimes make problems. Therefore, further modification should be needed to apply the method to the complicated scattering analysis. In this paper, we propose to apply a time-domain null filtering scheme to reduce some of the dominant signal components. It can be shown by a simple experiment that the new technique can enhance estimation accuracy of the frequency characteristic in the Root-MUSIC algorithm.
This paper proposes a mutual exclusion method that is unified for the parallel and distributed systems. The method partially serializes requests into partial queues of requests, which are next totally serialized into a main queue. A request in the main queue is authorized to enter the critical section (CS) when the request receives the privilege token from the previous request in the queue. In the distributed system of N sites that each is a parallel system, mutual exclusion is performed by cooperation of two algorithms based on the same method. The algorithm for the distributed system works on a logical network (that is a directed tree) of S ( N) sites. The algorithm for each site produces a local-main queue of requests. The chunk of requests in the local queue is concatenated at a time to the partial queue of the distributed system. The the cost of mutual exclusion -- the number of intersite messages required per CS entry -- is reduced to O(1) (between 0 and 3).
It is well recognized that the electromagnetic interference due to indirect electrostatic discharge (ESD) is not always proportional to the ESD voltage and also that the lower voltage ESD sometimes causes the more serious failure to high-tech information equipment. In order to theoretically examine the peculiar phenomenon, we propose an analytical approach to model the indirect ESD effect. A source ESD model is given here using the spark resistance presented by Rompe and Weizel. Transient electromagnetic fields due to the ESD event are analyzed, which are compared with the experimental data carefully given by Wilson and Ma. A model experiment for indirect ESD is also conducted to confirm the validity of the ESD model presented here.
Kiyoyasu MARUYAMA Chawalit BENJANGKAPRASERT Nobuaki TAKAHASHI Tsuyoshi TAKEBE
An adaptive algorithm for a single sinusoid detection using IIR bandpass filter with parallel block structure has been proposed by Nishimura et al. However, the algorithm has three problems: First, it has several input frequencies being impossible to converge. Secondly, the convergence rate can not be higher than that of the scalar structure. Finally, it has a large amount of computation. In this paper, a new algorithm is proposed to solve these problems. In addition, a new structure is proposed to reduce the amount of computation, in which the adaptive control signal generator is realized by the paralel block structure. Simulation results are given to illustrate the performance of the proposed algorithm.
Mototaka KAMOSHIDA Hirotomo INUI Toshiyuki OHTA Kunihiko KASAMA
The scaling laws between the design rules and the smallest sizes and numbers of particles capable of causing pattern defects and scrapping dies in semiconductor device manufacturing are described. Simulation with electromagnetic waveguide model indicates the possibility that particles, the sizes of which are of comparable order or even smaller than the wavelength of the lithography irradiation sources, are capable of causing pattern defects. For example, in the future 0.25 µm-design-rule era, the critical sizes of Si, Al, and SiO2 particles are simulated as 120 nm 120 nm, 120 nm 120 nm, and 560 nm 560 nm, respectively, in the case of 0.7 µm-thick chemically-amplified positive photoresist with 47 nm-thick top anti-reflective coating films. Future giga-scale integration era is also predicted.
Nguyen Ngoc BINH Masaharu IMAI Akichika SHIOMI Nobuyuki HIKICHI
This paper proposes a new method to design an optimal pipelined instructions set processor for ASIP development using a formal HW/SW codesign methodology. First, a HW/SW partioning algorithm for selecting an optimal pipelined architecture is outlined. Then, an adaptive detabase approach is presented that enables to enhance the optimality of the design through very accurate estimation of the performance of a pipelined ASIP in the HW/SW partitioning process. The experimental results show that the proposed method is effective and efficient.