Hiroshi MATSUOKA Kazuaki OKAMOTO Hideo HIRONO Mitsuhisa SATO Takashi YOKOTA Shuichi SAKAI
In this paper we describe the pipeline design and enhanced hardware for fast message handling in the RICA-1 processor, a processing element (PE) in the RWC-1 multiprocessor. The RWC-1 is based on the reduced inter-processor communication architecture (RICA), in which communication is combined with computation in the processor pipeline. The pipeline is enhanced with hardware mechanisms to support fine-grain parallel execution. The data paths of the RICA-1 superscalar processor are shared between communication and instruction execution to minimize implementation cost. A 128-PE system was built in January 1998 and is currently used for hardware debugging, software development, and performance evaluation.
Eko Fajar NURPRASETYO Akihiko INOUE Hiroyuki TOMIYAMA Hiroto YASUURA
In the design of an embedded system, the architecture of the core processor strongly affects the performance and cost of the total system. This paper discusses a scalable processor architecture, called a soft-core processor, which can be tuned for a target system. System designers can optimize several design parameters, such as the datapath width and instruction set, and generate customized processors for their applications. The design of Bung-DLX as a prototype soft-core processor is presented in this paper. A system design experiment using our processor has shown that the chip area of the optimized processor is halved when the critical path delay is reduced to one third of the original.
Yasumasa SUZAKI Satoru SEKINE Yasuhiro SUZUKI Hiromu TOBA
We demonstrate a very simple and compact optical transceiver diode module using passive alignment on a silicon bench with a V-groove. The excess loss caused by the passive alignment of an optical transceiver diode and a flat-end optical fiber is only 0.6 dB. A high coupling efficiency of -4.3 dB is obtained. This results in a high responsivity with a wavelength- and polarization-independence of 0.5 dB over a 70 nm wavelength range and in good laser performance.
To improve microprocessor performance, we propose to utilize histories of dynamic instruction sequences. A number of special-purpose memories integrated on the processor chip hold these histories. In this paper, we describe the usefulness of two such special-purpose memories: the Non-Consecutive basic block Buffer (NCB) and the Reference Prediction Table (RPT). The NCB improves instruction fetching efficiency in order to relieve control dependences. The RPT predicts data addresses in order to speculate on data dependences. A simulation study shows that the proposed mechanisms improve processor performance by up to 49.2%.
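As an illustration of how a table can predict data addresses from history, the following is a minimal sketch of a stride-based predictor indexed by the program counter of each load. The abstract does not specify how the RPT is organized, so the stride policy, the class name `ReferencePredictionTable`, and the helper `count_correct_predictions` are assumptions made for this example only.

```python
class ReferencePredictionTable:
    """Toy stride-based address predictor (an assumed organization, not the paper's)."""

    def __init__(self):
        # Maps a load's program counter to (last_address, stride).
        self.entries = {}

    def predict(self, pc):
        """Return the predicted next address for this load, or None if it is unseen."""
        entry = self.entries.get(pc)
        if entry is None:
            return None
        last_address, stride = entry
        return last_address + stride

    def update(self, pc, address):
        """Record the actual address and refresh the stride for this load."""
        entry = self.entries.get(pc)
        stride = address - entry[0] if entry is not None else 0
        self.entries[pc] = (address, stride)


def count_correct_predictions(trace):
    """Replay a trace of (pc, address) pairs and count correct predictions."""
    rpt = ReferencePredictionTable()
    correct = 0
    for pc, address in trace:
        if rpt.predict(pc) == address:
            correct += 1
        rpt.update(pc, address)
    return correct


if __name__ == "__main__":
    # One load walking an array of 8-byte elements.
    trace = [(0x40, 1000 + 8 * k) for k in range(10)]
    print(count_correct_predictions(trace))
```

In this sketch the first two accesses of a load train the entry, so a regular stride is predicted correctly from the third access onward.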
Kouichi NAGAMI Kiyoshi OGURI Tsunemichi SHIOZAWA Hideyuki ITO Ryusuke KONISHI
We propose an architectural reference for programmable devices that we call Plastic Cell Architecture (PCA). PCA is a reference for implementing a device with autonomous reconfigurability, which we also introduce in this paper. This reconfigurability is a further step toward new reconfigurable computing, which introduces variable- and programmable-grained parallelism to wired logic computing. This computing follows the Object-Oriented paradigm: it regards configured circuits as objects. These objects will be described in a new hardware description language dealing with the semantics of dynamic module instantiation. PCA is the fusion of SRAM-based FPGAs and cellular automata (CA), where the CA are dedicated to supporting run-time activities of objects. This paper focuses mainly on autonomous reconfigurability and PCA. The discussion examines a research direction towards general-purpose reconfigurable computing.
Phongsuphap SUKANYA Ryo TAKAMATSU Makoto SATO
In this paper, we propose a new approach for describing image patterns. We integrate the concepts of multiscale image analysis and the aura matrix (a statistical model for texture analysis related to Gibbs random fields and co-occurrences) to define image features. To obtain features that are robust to illumination variations and shading effects, we analyse images based on the Topographic Structure described by the Surface-Shape Operator, which describes gray-level image patterns in terms of 3D shapes instead of intensity values. We then illustrate the usefulness of the proposed features with texture classification. Results show that the proposed features extracted from multiscale images work much better than those from a single-scale image, and confirm that the proposed features are robust to illumination and shading variations. In comparisons with the MRSAR (Multiresolution Simultaneous Autoregressive) features using Mahalanobis distance and Euclidean distance, the proposed multiscale features perform better in classifying the entire Brodatz texture set: 112 categories and 2016 samples, with various brightness levels in each category.
Gil-Yoon KIM Yunju BAEK Heung-Kyu LEE
In this paper, we give a solution to the problem of conflict-free access to various slices of data in a parallel processor for image processing. Image processing operations require a memory system that permits parallel, conflict-free access to rows, columns, forward diagonals, backward diagonals, and blocks of a two-dimensional image array at an arbitrary location. Linear skewing schemes are useful for these requirements, but they require complex Euclidean division by a prime number. In contrast, nonlinear skewing schemes such as XOR-schemes have advantages over linear ones in address generation, but they allow conflict-free access only to some array slices within a restricted region. We propose a new XOR-scheme that allows conflict-free access to arbitrarily located slices of data for image processing, using twice as many memory modules as processing elements. Further, we propose an efficient data alignment network consisting of a (log N + 2)-stage multistage interconnection network based on the Omega network.
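To make the limitation of simple XOR-schemes concrete, here is a minimal sketch of a textbook XOR skew that maps pixel (i, j) to module (i mod n) XOR (j mod n), with n a power of two. This is not the scheme proposed in the paper, whose mapping is not given in the abstract; the function names and the choice of n = 8 are assumptions for illustration.

```python
def xor_module(i, j, n):
    """Memory module holding pixel (i, j) under a basic XOR skew (n a power of two)."""
    return (i % n) ^ (j % n)


def is_conflict_free(cells, n):
    """True if every cell in the slice maps to a different memory module."""
    modules = [xor_module(i, j, n) for i, j in cells]
    return len(set(modules)) == len(modules)


def row_slice(i, j, n):
    """n consecutive pixels of row i starting at column j."""
    return [(i, j + k) for k in range(n)]


def column_slice(i, j, n):
    """n consecutive pixels of column j starting at row i."""
    return [(i + k, j) for k in range(n)]


def diagonal_slice(i, j, n):
    """n pixels along the diagonal that starts at (i, j) and moves down-right."""
    return [(i + k, j + k) for k in range(n)]


if __name__ == "__main__":
    n = 8
    print(is_conflict_free(row_slice(3, 5, n), n))
    print(is_conflict_free(column_slice(3, 5, n), n))
    print(is_conflict_free(diagonal_slice(3, 5, n), n))
```

With this basic mapping, rows and columns starting at any location are conflict-free, but the diagonal is not, which is the kind of restriction the abstract attributes to earlier XOR-schemes.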
A new technique for reducing load latency is presented. This technique, named tunneling-load, utilizes the register specifier buffer to reduce the load latency without speculatively fetching from the data cache, and thus avoids the drawbacks of load address prediction techniques. As a consequence of the trend toward increasing clock frequency, the internal cache is no longer able to fill the speed gap between the processor and the external memory, and the data cache latency degrades processor performance. To hide this latency, several techniques that predict the load address have been proposed. These techniques perform speculative data cache fetching, which causes an explosion of memory traffic and pollution of the data cache. The tunneling-load solves these problems. We have evaluated the effects of the tunneling-load and found that, on an in-order-issue superscalar platform, instruction-level parallelism is increased by approximately 10%.
Kazushi MIMURA Masato OKADA Koji KURATA
In this paper, the dependence of the storage capacity of an analogue associative memory model using nonmonotonic neurons on static synaptic noise and static threshold noise is shown. This dependence is calculated analytically by means of the self-consistent signal-to-noise analysis (SCSNA) proposed by Shiino and Fukai. It is known that the storage capacity of an associative memory model can be improved markedly by replacing the usual sigmoid neurons with nonmonotonic ones, and the Hopfield model has been shown theoretically to be fairly robust against the introduction of static synaptic noise. In this paper, it is shown that when the monotonicity of the neurons is high, the storage capacity decreases rapidly as the static synaptic noise increases. It is also shown that the reduction of the storage capacity is more sensitive to an increase in the static threshold noise than to an increase in the static synaptic noise.
A macroscopic method for quantization of the evanescent fields brought about by total reflection is presented. Here, a semi-infinite space is assumed to be filled with a transparent dispersive dielectric with dielectric constant ε(ω) to the left of the plane z = 0, and to be empty to the right of the plane. The wave is assumed to be incident from the left, so the whole field is composed of the triplet of incident, reflected, and transmitted waves labeled by a continuous wave vector index. The transmitted wave in free space may be evanescent. The triplets are shown exactly, without using the slowly varying field approximation in the dispersive medium, to form orthogonal modes for different wave vectors, which provides the basis for quantizing the triplet while taking medium dispersion into account. The exact orthogonality relation reduces to the well-known one if the dielectric is nondispersive, ∂ε/∂ω = 0. By using the field expansion in terms of the orthogonal triplet modes, the total field energy is found to be the sum of the energies of independent harmonic oscillators. The wave momentum of the evanescent field is also discussed.
This paper describes a classification method for rotated and scaled textured images using invariant parameters based on spectral moments. Although it is well known that rotation invariants can be derived from moments of grey-level images, their use is limited to binary images because of computational instability. To overcome this drawback, we use the power spectrum instead of the grey levels to compute moments, and adjust the integration region of the moment evaluation to the change of scale. Rotation and scale invariants are obtained as ratios of different rotation invariants on the basis of a spectral-moment property with respect to scale. The effectiveness of the approach is illustrated through experiments on natural textures from the Brodatz album. In addition, the stability of the invariants with respect to the change of scale is discussed theoretically and confirmed experimentally.
Tomohiro TAMURA Masaki KATO Toshiyuki YOSHIDA Akinori NISHIHARA
This paper discusses a design technique for multidimensional (M-D) multirate filters that cause no checkerboard distortion. In the first part of this paper, a necessary and sufficient condition for M-D multirate filters to be checkerboard-distortion-free is derived in the frequency domain. Then, in the second part, this result is applied to a scanning line conversion system for television signals. To confirm the effectiveness of the derived condition, band-limiting filters are designed with and without the condition, and the results obtained with these filters are compared. The possibility of reducing the number of delay elements in such a system is also considered in order to derive an efficient implementation.
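The M-D condition itself is not stated in the abstract. As a rough one-dimensional illustration only, the sketch below uses a commonly cited criterion for interpolators: a constant input yields a constant steady-state output when all polyphase components of the filter have the same DC gain. The function names and the two example filters are assumptions for this example.

```python
def polyphase_dc_gains(h, factor):
    """Sum of the coefficients of each polyphase component of filter h."""
    return [sum(h[k::factor]) for k in range(factor)]


def is_checkerboard_free(h, factor, tol=1e-12):
    """1-D check: all polyphase components share the same DC gain."""
    gains = polyphase_dc_gains(h, factor)
    return max(gains) - min(gains) <= tol


def interpolate(x, h, factor):
    """Zero-insertion upsampling by `factor` followed by FIR filtering with h."""
    up = []
    for value in x:
        up.append(value)
        up.extend([0] * (factor - 1))
    out = []
    for n in range(len(up)):
        acc = 0
        for k in range(len(h)):
            if 0 <= n - k < len(up):
                acc += h[k] * up[n - k]
        out.append(acc)
    return out


if __name__ == "__main__":
    linear = [0.5, 1.0, 0.5]   # linear interpolation kernel for factor 2
    uneven = [1.0, 0.5, 0.5]   # polyphase gains differ (1.5 versus 0.5)
    constant_input = [1] * 8
    print(is_checkerboard_free(linear, 2), interpolate(constant_input, linear, 2)[2:])
    print(is_checkerboard_free(uneven, 2), interpolate(constant_input, uneven, 2)[2:])
```

For the second filter the steady-state output alternates between two levels even though the input is constant, which is the periodic pattern referred to as checkerboard distortion.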
The paper obtains an algorithm to estimate the irregular sampling in wavelet subspaces. Compared to our former work on the problem, the new estimate is relaxed for some wavelet subspaces.
Min Joon LEE Iickho SONG Suk Chan KIM Hyung-Myung KIM
The phase and frequency commands of a rotating radar system that uses frequency scanning and phase shifters to steer the beam in the azimuth and elevation directions, respectively, are derived in terms of the angles of the ground-based coordinate system. The derived frequency equation is approximated by a simple form to reduce the calculation time for real-time multi-function radar systems. It is shown that the approximate frequency commands are in good agreement with the exact ones if the range of the azimuth scanning is not too wide.
Akio ICHIKAWA Takashi TSUSHIMA Toshiyuki YOSHIDA Yoshinori SAKAI
This paper proposes a bitstream scaling technique for MPEG video for the purpose of media synchronization. The proposed scaling technique can reduce the frame rate as well as the bit rate of an MPEG data sequence to fit the values specified by a synchronization system. The advantage of the proposed technique over existing scaling methods is that it considers not only the performance of synchronization but also the picture quality of the resulting sequences. To further improve the quality of sequences scaled by the proposed method, this paper also proposes an MPEG encoding technique that sets some of the parameters to values suitable for the scaling. An experiment using these techniques in an actual media synchronization system has illustrated the usefulness of the proposed approach.
Makoto NAKASHIZUKA Yuji HIURA Hisakazu KIKUCHI Ikuo ISHII
We introduce an image contour clustering method based on a multiscale image representation and its application to image compression. Multiscale gradient planes are obtained from the mean squared sum of the 2D wavelet transform of an image. The decay of the multiscale gradient planes across scales depends on the Lipschitz exponent. Since the Lipschitz exponent indicates the spatial differentiability of an image, the multiscale gradient planes represent smoothness or sharpness around edges on image contours. We apply vector quantization (VQ) to the multiscale gradient planes at contours, and cluster the contours in terms of the representative vectors in VQ. Since the multiscale gradient planes indicate the Lipschitz exponents, the image contours are clustered according to their gradients and Lipschitz exponents. Moreover, we present an image recovery algorithm for the multiscale gradient planes, and we achieve sketch-based image compression by vector quantization on the multiscale gradient planes.
Secret sharing schemes are useful for protecting important secrets. They are, however, inefficient if the secret shadow held by the shadowholder cannot be reused after recovering the shared secret. Traditionally, the (t, n) secret sharing scheme can be used only once, where t is the threshold value and n is the number of participants. To improve the efficiency, we propose an efficient dynamic secret sharing scheme. In the new scheme, each shadowholder holds a secret key and the corresponding public key. The secret shadow is constructed from the secret key in our scheme, while in previously proposed secret sharing schemes the secret key is the shadow. In addition, the shadow is not constructed by the shadowholder unless it is necessary, and no secure delivery channel is needed. Moreover, this paper further discusses how to change the shared secret and the threshold policy, and how to detect cheaters. Therefore, this scheme provides an efficient way to maintain important secrets.
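For readers unfamiliar with (t, n) threshold sharing, the following is a minimal sketch of the classical polynomial-based construction usually attributed to Shamir, in which any t of n shares recover the secret. It is not the dynamic scheme proposed in the paper, which the abstract does not detail; the field modulus `PRIME` and the function names are assumptions for illustration.

```python
import secrets

# Assumed field modulus for this sketch: the Mersenne prime 2**127 - 1.
PRIME = 2**127 - 1


def make_shares(secret, t, n):
    """Split `secret` into n shares such that any t of them recover it."""
    # Random polynomial of degree t - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t - 1)]

    def evaluate(x):
        # Horner evaluation of the polynomial modulo PRIME.
        y = 0
        for c in reversed(coeffs):
            y = (y * x + c) % PRIME
        return y

    return [(x, evaluate(x)) for x in range(1, n + 1)]


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the shares supplied."""
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8 or later).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    shares = make_shares(123456789, t=3, n=5)
    print(recover_secret(shares[:3]) == 123456789)
```

In this classical construction the shares are tied to one polynomial and therefore to one secret, which is the single-use limitation that motivates the dynamic scheme described above.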
Video-on-Demand (VOD) servers are becoming feasible. These servers are a building component in a heterogeneous multimedia environment, but they have voluminous data to store and manage. If only disk-based secondary storage systems were used to store and manage this huge amount of data, the system cost would be prohibitively high. A tape-based tertiary storage system seems to be a reasonable solution for lowering the cost of storing and managing this continuous data. However, using a tertiary storage system to store large continuous data introduces several issues. These are mainly the replacement policy on disks, the decomposition and placement of continuous data chunks on tapes, and the scheduling of multiple requests for materializing objects from tapes to disks. In this paper we address these issues and propose solutions based on heuristics that we evaluated in a simulator.
Masamitsu KANEKO Keiichi KANETO
Electrochemomechanical deformation (ECMD) of poly(o-methoxyaniline) (PoMAn) film has been studied in various acid solutions, with anions such as Cl-, HSO4-, BF4-, and p-toluene sulfate. The magnitude of the ECMD of the film depends linearly on the degree of oxidation of the film, as in the case of polyaniline (PAn). A deformation ratio of 2.53% along the stretched direction is obtained for a 30% reduction. In contrast to that of PAn, however, the ECMD of PoMAn does not depend markedly on the kind of anion. Transient responses of the current and the deformation to stepwise application of the potential are investigated, together with the diffusion coefficient of ions in the films. The results are discussed in terms of the effect of the substituted methoxy group.
Toshinori HOSOKAWA Toshihiro HIRAOKA Mitsuyasu OHTA Michiaki MURAOKA Shigeo KUNINOBU
We present a partial scan design method based on n-fold line-up structures in order to achieve high fault efficiency and reduce test pattern generation time for practical LSIs. We also present a partial scan design method based on the state justification of pure load/hold FFs in order to achieve high fault efficiency and reduce the number of scan FFs for practical LSIs with many load/hold FFs. Experimental results for practical LSIs show that the proposed methods can achieve high fault efficiency (more than 99%) and reduce the number of scan FFs for LSIs with many load/hold FFs.