Somchai KITTICHAIKOONKIT Michitaka KAMEYAMA
In the applications of the fast Fourier transform (FFT) to real-world computation such as robot vision, high-speed processing with small latency is an important issue. In this paper, we propose a linear array processor for the minimum-latency FFT computation. The processor is constructed by identical butterfly elements (BE's). The key concept to minimize the latency is that each BE generates its output data immediately after its input data become available, with 100% utilization of its arithmetic unit. We also introduce the real-valued FFT to perform the complex-valued FFT. We utilize a double linear array structure so that the parallel processing can be realized without communication between the linear arrays. As a result, the hardware amount of a single BE is reduced to half that of conventional designs. The latency of the proposed FFT processor is greatly reduced in comparison with conventional linear array FFT processors.
This paper describes the novel relaxation-based algorithm for the harmonic analysis of nonlinear circuits. First, we present Iterated Spectrum Analysis based on harmonic balance method, where the harmonic balance method is applied to every node independently. As a result, we can avoid dealing with large scale Jacobian matrices and reduce the total simulation time, compared with the conventional method based on Galerkin's procedure or the harmonic balance method. Next, we define the frequency domain latency. Furthermore, we refer to the possibility for exploitation of three types of latency, i.e., relaxation iteration latency, frequency domain latency and Newton iteration latency. And we propose the multirate-sampling technique based on the consideration of the frequency domain latency. Finally, we apply the present technique to the simple analog circuit simulation and verify its availability for the harmonic analysis.
Makoto HONDA Michitaka KAMEYAMA Tatsuo HIGUCHI
The demand for high-speed image processing is obvious in many real-world computations such as robot vision. Not only high throughput but also small latency becomes an important factor of the performance, because of the requirement of frequent visual feedback. In this paper, a high-performance VLSI image processor based on the multiple-valued residue arithmetic circuit is proposed for such applications. Parallelism is hierarchically used to realize the high-performance VLSI image processor. First, spatially parallel architecture that is different from pipeline architecture is considered to reduce the latency. Secondly, residue number arithmetic is introduced. In the residue number arithmetic, data communication between the mod mi arithmetic units is not necessary, so that multiple mod mi arithmetic units can be completely separated to different chips. Therefore, a number of mod mi multiply adders can be implemented on a single VLSI chip based on the modulus-slice concept. Finally, each mod mi arithmetic unit can be effectively implemented in parallel structure using the concept of a pseudoprimitive root and the multiple-valued current-mode circuit technology. Thus, it is made clear that the throughout use of parallelism makes the latency 1/3 in comparison with the ordinary binary implementation.
Masakatsu NISHIGAKI Nobuyuki TANAKA Hideki ASAI
This paper describes a mixed mode circuit simulation by the direct and relaxation-based methods with dynamic network partitioning. For the efficient circuit simulation by the direct method, the algorithms with circuit partitioning and latency technique have been studied. Recently, the hierarchical decomposition and latency and their validities have been researched. Network tearing techniques enable independent analysis of each subnetwork except for the local datum nodes. Therefore, if the local datum nodes are also torn, each subnetwork is separated entirely. Since the network separation is based on relaxation approach, the implementation of the separation technique in the circuit simulation by the direct method corresponds to performing the mixed mode simulation by the direct and relaxation-based methods. In this paper, a dynamic "network separation" technique based on the tightness of the coupling between subnetworks is suggested. Then, by the introduction of dynamic network separation into the simulator SPLIT with hierarchical decomposition and latency, the mixed mode circuit simulator, which selects the direct method or the relaxation method and determines the block size of the latent circuit dynamically and suitably, is constructed.
Masakatsu NISHIGAKI Nobuyuki TANAKA Hideki ASAI
For the efficient circuit simulation by the direct method, network tearing and latency techniques have been studied. This letter describes a circuit simulator SPLIT with hierarchical decomposition and latency. The block size of the latent subcircuit can be determined dynamically in SPLIT. We apply SPLIT to the MOS circuit simulation and verify its availability.