Takashi MAEBA Mitsuyoshi SUGAYA Shoji TATSUMI Ken'ichi ABE
This letter presents parallel algorithms for matrix multiplication and the fast Fourier transform (FFT) that are significant problems arising in engineering and scientific applications. The proposed algorithms are designed on a 3-dimensional processor array with separable buses (PASb). We show that a PASb consisting of N N h processors can compute matrix multiplication of size N N and the FFT of size N in O(N/h+log N) time, respectively. In order to examine ease of hardware implementation, we also evaluate the VLSI complexity of the algorithms. A result obtained achieves an optimal bound on area-time complexity when h=O(N/log N).
This paper shows an extensive software performance analysis of dedicated hash functions, particularly concentrating on Pentium III, which is a current dominant processor. The targeted hash functions are MD5, RIPEMD-128 -160, SHA-1 -256 -512 and Whirlpool, which fully cover currently used and future promised hashing algorithms. We try to optimize hashing speed not only by carefully arranging pipeline scheduling but also by processing two or even three message blocks in parallel using MMX registers for 32-bit oriented hash functions. Moreover we thoroughly utilize 64-bit MMX instructions for maximizing performance of 64-bit oriented hash functions, SHA-512 and Whirlpool. To our best knowledge, this paper gives the first detailed measured performance analysis of SHA-256, SHA-512 and Whirlpool.
Hiroaki SUZUKI Tadashi DOHI Hiroyuki OKAMURA
In this paper, we consider the similar software cost models with periodic rejuvenation to Garg, Puliafito, Telek and Trivedi (1995) under the cost effectiveness criteria. First, an alternative model as well as the original one are analyzed by Markov regenerative processes. We derive analytically the optimal periodic software rejuvenation policies which maximize the cost-effectiveness in the steady state for two models. Further, we develop statistical non-parametric algorithms to estimate the optimal software rejuvenation policies, provided that the sample data to characterize the system failure times are given. Then, the total time on test (TTT) concept is used. In numerical examples, we compare the periodic software rejuvenation policy with the non-periodic one, and investigate the asymptotic properties of the non-parametric estimators for the optimal software rejuvenation policies through a simulation experiment.
Qing HAN Keizo INAGAKI Kyouichi IIGUSA Robert SCHLUB Takashi OHIRA Masami AKAIKE
Harmonic distortions of a recently developed lightweight film-type ESPAR (Electronically Steerable Passive Array Radiator) antenna are investigated experimentally. These distortions arise from the nonlinearity of the varactor diodes that are directly integrated with the parasitic radiator elements to control the antenna's radiation pattern. A reactive-near-field measurement technique that employs low-interference probes in an ultra-small anechoic box is used to reduce experimental time and cost. An anti-series varactor pair is introduced and compared with the conventional single varactor. Consequently, an ESPAR antenna equipped with the anti-series varactor pair exhibits remarkable suppression of nonlinear distortion. In particular, the second- and the third-order harmonic is reduced by approximately 20 dB and 12 dB from the level of a single varactor type ESPAR antenna, respectively.
Tomohiro YONEDA Eric MERCER Chris MYERS
This paper develops a modular synthesis algorithm for timed circuits that is dramatically accelerated by partial order reduction. This algorithm synthesizes each module in a hierarchical design individually. It utilizes partial order reduction to reduce the state space explored for the other modules by considering a single interleaving of concurrently enabled transitions. This approach better manages the state explosion problem resulting in a more than 2 order of magnitude reduction in synthesis time. The improved synthesis time enables the synthesis of a larger class of timed circuits than was previously possible.
Takaaki NAKASHIMA Akihiro FUJIWARA
Parallelization of the P-complete problem is known to be difficult. In this paper, we consider the parallelizability of a stack breadth-first search (stack BFS) problem, which is proved to be P-complete. We first propose the longest path length (LPL) as a measure for the P-completeness of the stack BFS. Next, using this measure, we propose an efficient parallel algorithm for the stack BFS. Assuming the size and LPL of an input graph are n and l, respectively, the complexity of the algorithm indicates that the stack BFS is in the class NCk+1 if l = O(logk n), where k is a positive integer. In addition, the algorithm is cost optimal if l=O(nε), where 0 < ε < 1.
In this paper, an associative memory model with a forgetting process proposed by Mezard et al. is investigated as a means of storing sparsely encoded patterns by the SCSNA proposed by Shiino and Fukai. Similar to the case of storing non-sparse (non-biased) patterns as analyzed by Mezard et al., this sparsely encoded associative memory model is also free from a catastrophic deterioration of the memory caused by memory pattern overloading. We theoretically obtain a relationship between the storage capacity and the forgetting rate, and find that there is an optimal forgetting rate leading to the maximum storage capacity. We call this the optimal storage capacity rate. As the memory pattern firing rate decreases, the optimal storage capacity increases and the optimal forgetting rate decreases. Furthermore, we shown that the capacity rate (i.e. the ratio of the storage capacity for the conventional correlation learning rule to the optimal storage capacity) is almost constant with respect to the memory pattern firing rate.
Junyi XU Jian YANG Yingning PENG Chao WANG Yuei-An LIOU
In this paper, a new method is proposed for supervised classification of ground cover types by using polarimetric synthetic aperture radar (SAR) data. The concept of similarity parameter between two scattering matrices is introduced for characterizing target scattering mechanism. Four similarity parameters of each pixel in image are used for classification. They are the similarity parameters between a pixel and a plane, a dihedral, a helix and a wire. The total received power of each pixel is also used since the similarity parameter is independent of the spans of target scattering matrices. The supervised classification is carried out based on the principal component analysis. This analysis is applied to each data set in image in the feature space for getting the corresponding feature transform vector. The inner product of two vectors is used as a distance measure in classification. The classification result of the new scheme is shown and it is compared to the results of principal component analysis with other decomposition coefficients, to demonstrate the effectiveness of the similarity parameters.
Munehiro MATSUURA Tsutomu SASAO Jon T. BUTLER Yukihiro IGUCHI
A shared binary decision diagram (SBDD) represents a multiple-output function, where nodes are shared among BDDs representing the various outputs. A partitioned SBDD consists of two or more SBDDs that share nodes. The separate SBDDs are optimized independently, often resulting in a reduction in the number of nodes over a single SBDD. We show a method for partitioning a single SBDD into two parts that reduces the node count. Among the benchmark functions tested, a node reduction of up to 23% is realized.
The one-sample locally optimum rank detector test statistics for composite signals in multiplicative and signal-dependent noise are obtained. Since the one-sample locally optimum rank detector makes use of the sign statistics of observations as well as the rank statistics, both 'even' and 'odd' score functions have to be considered. Although the one-sample locally optimum rank detector requires two score functions while the two-sample detector requires only one score function, the one-sample detector requires fewer calculations since it has to rank fewer observations.
Masanori SHIMADA Toshimichi SAITO
This paper presents a flexible learning algorithm for the binary neural network that can realize a desired Boolean function. The algorithm determines hidden layer parameters using a genetic algorithm. It can reduce the number of hidden neurons and can suppress parameters dispersion. These advantages are verified by basic numerical experiments.
This paper presents circuit models for the description of the frequency-dependent behavior of coils for horizontal deflection in CRTs. This enables CRT circuit designers to use circuit simulation programs to predict the high-frequency behavior of the interaction between the deflection coils and the drive circuit. An overview is given of the major phenomena that occur in CRT deflection coils at various frequencies. Models are presented for the dissipative, the capacitive, and the resonant behavior in successive frequency intervals. With these models, phenomena such as power dissipation and ringing can not only be related to design parameters, but can also be calculated from impedance characteristics which are relatively easy to measure.
Payman MOALLEM Karim FAEZ Javad HADDADNIA
Finding corresponding edges is considered being the most difficult part of edge-based stereo matching algorithms. Usually, correspondence for a feature point in the first image is obtained by searching in a predefined region of the second image, based on epipolar line and maximum disparity. Reduction of search region can increase performances of the matching process, in the context of execution time and accuracy. Traditionally, hierarchical multiresolution techniques, as the fastest methods are used to decrease the search space and therefore increase the processing speed. Considering maximum of directional derivative of disparity in real scenes, we formulated some relations between maximum search space in the second images with respect to relative displacement of connected edges (as the feature points), in successive scan lines of the first images. Then we proposed a new matching strategy to reduce the search space for edge-based stereo matching algorithms. Afterward, we developed some fast stereo matching algorithms based on the proposed matching strategy and the hierarchical multiresolution techniques. The proposed algorithms have two stages: feature extraction and feature matching. We applied these new algorithms on some stereo images and compared their results with those of some hierarchical multiresolution ones. The execution times of our proposed methods are decreased between 30% to 55%, in the feature matching stage. Moreover, the execution time of the overall algorithms (including the feature extraction and the feature matching) is decreased between 15% to 40% in real scenes. Meanwhile in some cases, the accuracy is increased too. Theoretical investigation and experimental results show that our algorithms have a very good performance with real complex scenes, therefore these new algorithms are very suitable for fast edge-based stereo applications in real scenes like robotic applications.
Bhed Bahadur BISTA Kaoru TAKAHASHI Norio SHIRATORI
In this paper, we consider a flexible method for designing n-entities communication protocols and services. The proposed technique considers alternative and parallel composition of n service specifications and n protocol specifications, where n 2. The specifications are specified in Basic LOTOS which is a Formal Description Technique (FDT). We use the weak bisimulation equivalence () to represent the correctness properties between the service specification and the protocol specification.
Rong-Long WANG Zheng TANG Qi-Ping CAO
When solving combinatorial optimization problems with a binary Hopfield-type neural network, the updating process in neural network is an important step in achieving a solution. In this letter, we propose a new updating procedure in binary Hopfield-type neural network for efficiently solving combinatorial optimization problems. In the new updating procedure, once the neuron is in excitatory state, then its input potential is in positive saturation where the input potential can only be reduced but cannot be increased, and once the neuron is in inhibitory state, then its input potential is in negative saturation where the input potential can only be increased but cannot be reduced. The new updating procedure is evaluated and compared with the original procedure and other improved methods through simulations based on N-Queens problem. The results show that the new updating procedure improves the searching capability of neural networks with shorter computation time. Particularly, the simulation results show that the performance of proposed method surpasses the exiting methods for N-queens problem in synchronous parallel computation model.
Yoichi SAITO Takahiro YAMASAKI Fumio TAKAHATA
This paper presents the transmission performance of a class-IV partial-response signaling SSB system and proposes a method that can improve its power efficiency. A line code that has no dc component has been used in the SSB transmission of digital signals. The type of line code, such as a partial-response signaling, increases the modulation states, and as a result, decreases the power efficiency. To overcome this obstacle, a new demodulation method called "re-filtering and combining" is proposed on the assumption of orthogonal phase detection. The demodulated quadrature channel is re-filtered by a Hilbert filter and is combined with the in-phase channel. It is confirmed by computer simulations that the new demodulation method improves the BER performance and a 3 dB improvement of the power efficiency is obtained.
Hyuk-Jae JANG Masayuki KAWAMATA
This paper proposes a design method of 2-D variable IIR digital filters with high frequency tuning accuracy. In the proposed method, a parallel complex allpass structure is used as the prototype structure of the 2-D variable digital filters in order to obtain low sensitivity characteristic. Because the proposed 2-D variable digital filter is composed of first-order complex allpass sections connected in parallel, the proposed variable digital filter possesses several advantages such as low sensitivity characteristic in the passband, simple stability monitoring and high parallelism. In order to improve the frequency tuning accuracy of the proposed variable digital filter, each first-order complex allpass section is substituted by a new first-order complex allpass section with low sensitivity characteristic. Moreover, the coefficient sensitivity analysis of a 2-D parallel complex allpass structure is presented. Numerical examples show that the proposed 2-D variable IIR digital filter has high tuning accuracy under the finite coefficient wordlength.
Kenichi ICHINO Takeshi ASAKAWA Satoshi FUKUMOTO Kazuhiko IWASAKI Seiji KAJIHARA
An n-detection testing for stuck-at faults can be used not only for delay fault testing but also for detection of unmodeled faults. We have developed a hybrid BIST circuit; that is, a method consisting of a shift register with partial rotation and a procedure that selects test vectors from ATPG ones. This testing method can perform at-speed testing with high stuck-at fault coverage. During the at-speed testing, a subset of the ATPG vectors is input by using a low-speed tester. Computer simulations on ISCAS'85, ISCAS'89, and ITC'99 circuits are conducted for n = 1, 2, 3, 5, 10, and 15. The simulation results show that the amount of test vectors can be reduced to ranging from 52.3% to 0.9% in comparison with that of the ATPG vectors. As a result, the proposed method can reduce the cost of at-speed testing.
In recent years, perpendicular magnetic recording have progressed rapidly. It will not be long before perpendicular magnetic recording is put into practical use. However there have been few tools contributing to the optimum design of perpendicular magnetic recording media and heads except computer simulations. The authors have introduced a simple method based on the concept of self-consistent magnetization to analytically predict a transition parameter in terms of parameters of recording media and writing heads. Moreover we have discussed the origin of media noise by using a time-domain analysis of readout voltage and Voronoi cell model analysis. In this paper, main parameters to realize high bit density recording over 100 Gbit/inch2 is discussed first through these methods, and then the current status, the future problems and the prospects in perpendicular magnetic recording technology are described.
Hernan AGUIRRE Kiyoshi TANAKA Shinjiro OSHITA
In this work we study the performance of a distributed GA that incorporates in its core parallel cooperative-competitive genetic operators. A series of controlled experiments are conducted using various large and difficult 0/1 multiple knapsack problems to test the robustness of the distributed GA. Simulation results verify that the proposed distributed GA compared with a canonical distributed GA significantly gains in search speed and convergence reliability with less communication cost for migration.