Kazuhiro MOTEGI Shigeyoshi WATANABE
Practical device simulation requires solving large sparse systems of linear equations at high speed with a direct solution method. Parallel computation can greatly reduce the CPU time needed to solve these equations. The Multi Step Diakoptics (MSD) algorithm is proposed as one such parallel direct method; it is based on Diakoptics, a tearing-based parallel computation method for sparse linear equations. We have applied the MSD algorithm to device simulation. This letter describes the partition and connection schedules in the MSD algorithm. The algorithm is evaluated on a massively parallel computer with distributed memory (AP1000).
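As a rough illustration of the tearing idea behind Diakoptics (not the MSD algorithm itself, whose partition and connection schedules the abstract does not spell out), the sketch below solves a bordered block-diagonal system in which every torn subsystem is a single scalar equation. The function name `solve_bordered` and the numbers are hypothetical; the point is only that the subsystems are reduced independently and then joined through one small connection equation.

```python
def solve_bordered(a, b, c, d, f, g):
    """Solve a[i]*x[i] + b[i]*y = f[i] for each torn subsystem i,
    together with the connection equation sum(c[i]*x[i]) + d*y = g."""
    n = len(a)
    # Partition step: each subsystem contributes independently, so this
    # part could be distributed with one subsystem per processor.
    reduced = [(c[i] * b[i] / a[i], c[i] * f[i] / a[i]) for i in range(n)]
    # Connection step: small Schur-complement equation for the tie variable y.
    s = d - sum(r[0] for r in reduced)
    y = (g - sum(r[1] for r in reduced)) / s
    # Back-substitution, again independent per subsystem.
    x = [(f[i] - b[i] * y) / a[i] for i in range(n)]
    return x, y

# Hypothetical 3x3 example with two subsystems and one tie variable.
x, y = solve_bordered(a=[2.0, 4.0], b=[1.0, 1.0], c=[1.0, 2.0],
                      d=5.0, f=[5.0, 11.0], g=20.0)
```

In a real device simulation each `a[i]` would be a sparse matrix block factored locally, and the connection system would be correspondingly larger.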
Kyoko TSUKANO Takahiro INOUE Shoichi KOGA Fumio UENO
A new CMOS neuron circuit suitable for VLSI implementation of artificial neural networks is proposed. A cross-coupled current comparator structure is adopted to obtain large differential neuron signals for high-speed multi-input/multi-output neuron operations. In addition, the shape of the output function of the proposed neuron circuit can be modified by simply varying the value of the auxiliary current sources. To estimate the performance of the proposed circuit as an element in a neural network, a 15-bit associative memory based on the Hopfield neural network was designed. The performances of a single 7-input neuron and of the 15-neuron associative memory are confirmed by SPICE simulations.
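To make the associative-memory setting concrete, here is a minimal software sketch of a 15-neuron Hopfield network. It assumes the textbook Hebbian storage rule and a hard-threshold output function; the abstract does not state which rule or output shape the circuit-level design uses, and the stored pattern below is hypothetical.

```python
N = 15  # number of neurons, matching the 15-bit memory in the abstract

def train(patterns):
    """Build a symmetric, zero-diagonal weight matrix with the Hebbian rule."""
    w = [[0] * N for _ in range(N)]
    for p in patterns:
        for i in range(N):
            for j in range(N):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def recall(w, state, sweeps=5):
    """Update neurons one at a time using a hard-threshold output function."""
    s = list(state)
    for _ in range(sweeps):
        for i in range(N):
            field = sum(w[i][j] * s[j] for j in range(N))
            s[i] = 1 if field >= 0 else -1
    return s

# Hypothetical stored pattern in +1/-1 coding.
stored = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1]
weights = train([stored])

# Corrupt two bits, then let the network settle.
noisy = list(stored)
noisy[0], noisy[7] = -noisy[0], -noisy[7]
recovered = recall(weights, noisy)
```

With a single stored pattern and two flipped bits, the network settles back to the stored pattern.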
Kiyoshi NISHIKAWA Russell M. MERSEREAU
We present a successful method for designing 2-D circularly symmetric IIR lowpass filters with constant group delay. The procedure is based on a transformation of a 1-D prototype IIR filter with constant group delay, whose magnitude response is the 2-D cross-sectional response. The 2-D filter transfer function has a separable denominator and a numerator which is obtained from the prototype numerator by means of a series of McClellan transformations whose free parameters can be optimized by a successive procedure. The method is illustrated by an example.
Based on the Fornasini-Marchesini second model, an efficient algorithm is developed to derive the characteristic polynomial and the inverse of the system matrix from the state-space parameters. As a result, the external description of the Fornasini-Marchesini second model is clarified. A technique for designing 2-D recursive digital filters in the frequency domain is then presented by using the Fornasini-Marchesini second model. The resulting filter approximates both magnitude and group delay specifications and its stability is always guaranteed. Finally, three design examples are given to illustrate the utility of the proposed technique.
Masayuki KAWAMATA Takehiko KAGOSHIMA Tatsuo HIGUCHI
This paper proposes an efficient method for designing three-dimensional (3-D) recursive digital filters for video signal processing via decomposition of magnitude specifications. A given magnitude specification of a 3-D digital filter is decomposed into specifications of 1-D digital filters in three different (horizontal, vertical, and temporal) directions. This decomposition reduces the design of 3-D digital filters to the design of 1-D digital filters, which can be designed with ease by conventional methods. Consequently, 3-D digital filters can be designed efficiently without complicated stability tests or a large amount of computation. In order to process video signals in real time, the 1-D digital filters in the temporal direction must be causal, which is not the case in the horizontal and vertical directions. Since the proposed method can approximate negative magnitude specifications obtained by the decomposition with causal 1-D IIR filters, the 1-D digital filters in the temporal direction can be causal. Therefore the 3-D digital filters designed by the proposed method are suitable for real-time video signal processing. The designed 3-D digital filters have a parallel separable structure with high parallelism, regularity, and modularity, and thus are suitable for high-speed VLSI implementation.
Tsuyosi TAKEBE Masatoshi MURAKAMI Koji HATANAKA Shinya KOBAYASHI
This paper treats the problem of realizing high-speed 2-D denominator-separable digital filters. The 2-D data plane is partitioned into square blocks, and filtering proceeds block by block sequentially. A fast intra-block parallel processing method was developed using a block state-space realization, which allows simultaneous computation of all the next block states and the outputs of one block. As the block state matrix of the filter is highly sparse, its rows and columns are interchanged to reduce the matrix size. The filter is implemented on a multiprocessor system in which one processor is assigned to each row of the matrix to perform the row-column vector multiplication. All processors work in a synchronized fashion. The number of processors in this implementation is equal to the number of rows of the reduced state matrix, and throughput increases with block length.
Somkiat TANGKITVANICH Masamichi SHIMURA
This paper presents a system that automatically refines a theory expressed in function-free first-order logic. Our system can efficiently correct multiple faults in both the concept and subconcepts of the theory, given only the classified examples of the concept. It can refine larger classes of theory than existing systems can since it has overcome many of their limitations. Our system is based on a new combination of an inductive and an explanation-based learning algorithm, which we call the biggest-first multiple-example EBL (BM-EBL). From a learning perspective, our system is an improvement over the FOIL learning system in that our system can accept a theory as well as examples. An experiment shows that when our system is given a theory with a classification error rate as high as 50%, it can still learn faster and more accurately than when it is not given any theory.
Koichi HAYASHI Mitsuru KOMATSU Masakatsu NISHIGAKI Hideki ASAI
This letter describes the waveform relaxation algorithm with a dynamic circuit partitioning technique based on the operating point of bipolar devices. We verify its usefulness for the simulation of digital bipolar transistor circuits.
Optimal static load balancing problems in open BCMP queueing networks with state-independent arrival and service rates are studied. Examples include optimal static load balancing in distributed computer systems and static routing in communication networks. We refer to the load balancing policy of minimizing the overall mean response (or sojourn) time of a job as the overall optimal policy. We show the conditions that the solutions of the overall optimal policy satisfy and show that the policy uniquely determines the utilization of each service center, the mean delay for each class and each path class, etc., although the solution, the utilization for each class, the mean delay for all classes at each service center, etc., may not be unique. Then we give the linear relations that characterize the set whose elements are the optimal solutions, and discuss the condition under which the overall optimal policy has a unique solution. In parametric analysis and numerical calculation of optimal values of performance variables, we must check whether they can be uniquely determined.
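The overall optimal policy can be illustrated on a much simpler special case than the BCMP networks treated here: a single job class split across parallel M/M/1 servers. The sketch below is only that special case, written under the assumption that the objective is the mean number of jobs in the system (equivalent to mean response time by Little's law). The function name `allocate` and the rates are hypothetical. It finds the Lagrange multiplier by bisection so that every used server has the same marginal delay.

```python
import math

def allocate(mu, total_rate):
    """Split total_rate over M/M/1 servers with service rates mu so that
    sum(lam_i / (mu_i - lam_i)) is minimized."""
    if not 0 < total_rate < sum(mu):
        raise ValueError("total_rate must be positive and below total capacity")

    def flows(alpha):
        # Optimality condition: mu_i / (mu_i - lam_i)**2 == alpha on every
        # server that receives traffic; slower servers may get none.
        return [max(0.0, m - math.sqrt(m / alpha)) for m in mu]

    lo, hi = 1e-12, 1.0
    while sum(flows(hi)) < total_rate:  # grow the bracket until it is feasible
        hi *= 2.0
    for _ in range(200):                # bisection on the multiplier alpha
        mid = 0.5 * (lo + hi)
        if sum(flows(mid)) < total_rate:
            lo = mid
        else:
            hi = mid
    return flows(hi)

equal_split = allocate([10.0, 10.0], 8.0)  # two identical servers
skewed = allocate([10.0, 1.0], 1.0)        # a fast and a slow server
```

In this single-class case the optimum is unique. The non-uniqueness discussed in the abstract arises with multiple classes, where different per-class splits can yield the same utilization at each service center.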
Toshihide TSUBATA Hiroaki KAWABATA Yoshiaki SHIRAO Masaya HIRATA Toshikuni NAGAHARA Yoshio INAGAKI
This letter describes one neuron's dynamics. This neuron provides its own feedback input. We call this neuron the recurrent neuron and investigate its nonlinear dynamics.
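The abstract does not give the neuron's equations, so the following is only a generic sketch of a neuron with self-feedback, assuming a discrete-time update x[t+1] = tanh(w * x[t] + b). The weight `w`, bias `b`, and the function name `iterate` are all hypothetical. Even this simple map shows qualitatively different behavior depending on the feedback strength.

```python
import math

def iterate(w, b, x0, steps=200):
    """Iterate the assumed self-feedback map x <- tanh(w * x + b)."""
    x = x0
    for _ in range(steps):
        x = math.tanh(w * x + b)
    return x

weak = iterate(w=0.5, b=0.0, x0=0.8)    # weak feedback: activity decays to 0
strong = iterate(w=2.0, b=0.0, x0=0.8)  # strong feedback: settles at a nonzero fixed point
```

For `w` below 1 the origin is the only fixed point. For `w` above 1 the origin becomes unstable and two symmetric nonzero fixed points appear, so the final state depends on the sign of the initial condition.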
Masayuki KAWAMATA Yasushi IWATA Tatsuo HIGUCHI
This paper designs and evaluates highly parallel VLSI processors for real-time 2-D state-space digital filters using a hierarchical behavioral description language and synthesizer. The architecture of the 2-D state-space digital filtering system is a linear systolic array of homogeneous VLSI processors, each of which consists of eight processing elements (PEs) executing 1-D state-space digital filtering with multi-input and multi-output. The hierarchical behavioral description language and synthesizer are adopted to design and evaluate the PEs and the VLSI processors. One 16-bit fixed-point PE executing (4, 4)-th order 2-D state-space digital filtering is described on the basis of distributed arithmetic in about 1,200 steps of the description language and is composed of 15 K gates in terms of 2-input NAND gates. One VLSI processor, which is a cascade connection of eight PEs, is composed of 129 K gates and can be integrated into one 15×15 [mm²] VLSI chip using 1 µm CMOS standard cells. The 2-D state-space digital filtering system composed of 128 VLSI processors at a 25 MHz clock can process a 1,024×1,024 image in 1.47 [msec] and thus can be applied to real-time conventional video signal processing.
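The figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below assumes the image is 1,024 by 1,024 pixels and, purely for illustration, that the pixels are divided evenly among all PEs; the abstract does not state how work is actually scheduled across the systolic array.

```python
processors = 128          # VLSI processors in the system (from the abstract)
pes_per_processor = 8     # PEs per processor (from the abstract)
clock_hz = 25e6           # 25 MHz clock (from the abstract)
frame_time_s = 1.47e-3    # time to process one image (from the abstract)
image_pixels = 1024 * 1024  # assumed image size

total_pes = processors * pes_per_processor       # PEs in the whole array
cycles_per_frame = clock_hz * frame_time_s       # clock cycles spent per image
pixels_per_pe = image_pixels / total_pes         # assumes an even split of pixels
cycles_per_pixel = cycles_per_frame / pixels_per_pe
frames_per_second = 1.0 / frame_time_s

# Gate count left over per processor beyond its eight 15 K-gate PEs.
overhead_kgates = 129 - pes_per_processor * 15
```

Under these assumptions the array has 1,024 PEs, each PE spends roughly 36 clock cycles per pixel, and the system could sustain several hundred frames per second, which is consistent with the real-time claim.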
Rinshi SUGINO Yoshiko OKUI Masaki OKUNO Mayumi SHIGENO Yasuhisa SATO Akira OHSAWA Takashi ITO
The mechanism of UV-excited dry cleaning using photoexcited chlorine radicals has been investigated for removing iron and aluminum contamination on a silicon surface. The iron and aluminum contaminants with a surface concentration of 10^13 atoms/cm^2 were intentionally introduced via an ammonium-hydrogen peroxide solution. The silicon etching rates from the UV-excited dry cleaning differ depending on the contaminants. Fe and Al can be removed in the same manner. The removal of Fe and Al is highly temperature dependent, and is little affected by the silicon etching depth. Both Fe and Al on the silicon surface were completely removed by UV-excited dry cleaning at a cleaning temperature of 170°C, and were decreased by two orders of magnitude from the initial level when the surface was etched only 2 nm deep.
We have developed an advanced tool for dimensioning circuit-switched networks, called CNEP (Circuit-Switched Network Evaluation Program), for effective design of digital networks. CNEP features a high-reliability network structure (node dispersion, double homing, etc.), both-way circuit operation, and circuit modularity (or big module size), all of which are critical for digital networks. CNEP also solves other dimensioning problems such as the cost difference between existing and newly installed circuits, and handles multi-hour traffic conditions, dynamic routing, and multiple-switching-unit nodes. Operations Research techniques are applied to produce exact and heuristic algorithms for these problems. Algorithms with good time-performance trade-off characteristics are chosen for CNEP.
This paper proposes a model for learning non-parametric densities using finite-dimensional parametric densities by applying Yamanishi's stochastic analogue of Valiant's probably approximately correct learning model to density estimation. The goal of our learning model is to find, with high probability, a good parametric approximation of the non-parametric target density with sample size and computation time polynomial in parameters of interest. We use a learning algorithm based on the minimum description length (MDL) principle and derive a new general upper bound on the rate of convergence of the MDL estimator to a true non-parametric density. On the basis of this result, we demonstrate polynomial-sample-size learnability of classes of non-parametric densities (defined under some smoothness conditions) in terms of exponential families with polynomial bases, and we prove that under some appropriate conditions, the sample complexity of learning them is bounded as $O\big((1/\varepsilon)^{(2r+1)/2r} \ln^{(2r+1)/2r}(1/\varepsilon) + (1/\varepsilon)\ln(1/\delta)\big)$ for a smoothness parameter $r$ (a positive integer), where $\varepsilon$ and $\delta$ are respectively accuracy and confidence parameters. Further, we demonstrate polynomial-time learnability of classes of non-parametric densities (defined under some smoothness conditions) in terms of histogram densities with equal-length cells, and we prove that under some appropriate condition, the sample complexity of learning them is bounded as $O\big((1/\varepsilon)^{3/2} \ln^{3/2}(1/\varepsilon) + (1/\varepsilon)\ln(1/\delta)\big)$.
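To get a feel for how the two sample-complexity bounds scale, the sketch below evaluates them with all hidden constants set to 1. It assumes the exponential-family bound reads (1/ε)^((2r+1)/2r) · ln^((2r+1)/2r)(1/ε) + (1/ε) · ln(1/δ) and the histogram bound is the same expression with exponent 3/2. The function names are hypothetical and the values are only meaningful relative to one another.

```python
import math

def exp_family_bound(eps, delta, r):
    """Bound for exponential families with polynomial bases (constants dropped)."""
    k = (2 * r + 1) / (2 * r)
    return (1 / eps) ** k * math.log(1 / eps) ** k + (1 / eps) * math.log(1 / delta)

def histogram_bound(eps, delta):
    """Bound for equal-cell histogram densities (constants dropped)."""
    return (1 / eps) ** 1.5 * math.log(1 / eps) ** 1.5 + (1 / eps) * math.log(1 / delta)

eps, delta = 0.01, 0.05
by_smoothness = {r: exp_family_bound(eps, delta, r) for r in (1, 2, 4, 8)}
hist = histogram_bound(eps, delta)
```

At r = 1 the two expressions coincide, and as the smoothness parameter r grows the exponent (2r+1)/2r approaches 1, so the leading term shrinks toward (1/ε) ln(1/ε).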
A formula for the variations in vertex-potentials caused by an increase of an edge-weight is derived using topological methods. This formula can be expressed in terms of the increase of the weight and the potential differences between two vertices joined by the edge with respect to three ordered vertex-pairs in the original network before the weight is increased.
We investigate the relationship between two different notions of reducibility among prediction (learning) problems within the distribution-free learning model of Valiant (PAC learning model). The notions of reducibility we consider are the analogues for prediction problems of the many-one reducibility and of the Turing reducibility. The former is the notion of prediction preserving reducibility developed by Pitt and Warmuth, and its generalization. Concerning these two notions of reducibility, we show that there exists a pair of prediction problems A and B, whose membership problems are polynomial time solvable, such that A is reducible to B with respect to the Turing reducibility, but not with respect to the prediction preserving reducibility. We show this result by making use of the notion of a class of polynomially sparse variants of a concept representation class. We first show that any class A of polynomially sparse variants of another class B is reducible to B with respect to the Turing reducibility. We then prove the existence of a prediction problem R and a class R' of polynomially sparse variants of R, such that R' does not reduce to R with respect to the prediction preserving reducibility.
Yasuhisa HAYASHI Satoshi KONDO Nobuyuki TAKASU Akio OGIHARA Shojiro YONEDA
This study proposes a new training method for the hidden Markov model with separate vector quantization (SVQ-HMM) in speech recognition. The proposed method uses the correlation of two different kinds of features: cepstrum and delta-cepstrum. The correlation is used to decrease the number of reestimations for the two features, and thus the total computation time for training the models decreases. The proposed method is applied to Japanese isolated digit recognition.
Haruyuki HARADA Mitsuru TANAKA Takashi TAKENAKA
This letter discusses the quality improvement of reconstructed images in diffraction tomography. An efficient iterative procedure based on the modified Newton-Kantorovich method and the Gerchberg-Papoulis algorithm is presented. The simulated results demonstrate the property of high-quality reconstruction even for cases where the first-order Born approximation fails.