IEICE global.ieice.org Site

Keyword Search Result

[Keyword] PAR(2741hit)

2641-2660hit(2741hit)

A Trial on Distance Education and Training through the PARTNERS Network
Masatomo TANAKA

LETTER

Vol:
E76-A No:7
Page(s):
1195-1198
Japan's PARTNERS Project, one of the programmes of ISY advocated by the UN, has just started. This letter is a brief introduction to the trials being carried out by the partners at the University of Electro-Communications under the Project. The focus is on distance education and training via ETS-V, overcoming the geographical extent and the cultural diversity of the Asia-Pacific Region.
Parameter Estimation of Uniform Image Blur Using DCT
Yasuo YOSHIDA Kazuyoshi HORIIKE Kazuhiro FUJITA

LETTER

Vol:
E76-A No:7
Page(s):
1154-1157
The matrix whose eigenvectors are the basis vectors of the DCT is introduced. This matrix leads to a convolution-product property using the DCT. Based on the property, the parameter of uniform blur, such as motion blur or out-of-focus blur, is estimated from the local minima of the DCT energy spectrum of a blurred image. Computer experiments confirmed that the DCT is superior to the DFT for estimating the parameter.
A Bit-Parallel Block-Parallel Functional Memory Type Parallel Processor Architecture
Kazutoshi KOBAYASHI Keikichi TAMARU Hiroto YASUURA Hidetoshi ONODERA

PAPER-Memory-Based Parallel Processor Architectures

Vol:
E76-C No:7
Page(s):
1151-1158
We propose a new Functional Memory type Parallel Processor (FMPP) architecture called bit-parallel block-parallel (BPBP) FMPP. Design details of a prototype BPBP FMPP chip are also shown. FMPP is a massively parallel processor architecture that has a memory-based simple two-dimensional regular array structure suitable for memory VLSI technology. Computation space increases as integration density of memory increases. Computation time does not depend on the number of processors. So far, a bit-serial word-parallel (BSWP) implementation based on a content addressable memory (CAM) has been the main one investigated as a promising FMPP architecture. In a BSWP FMPP, each word of a CAM works as a processor, and the amount of hardware is minimized by adopting a bit-serial operation, thus maximizing integration scale. The BSWP FMPP, however, does not allow operations between two words, a restriction that limits its applicability. In contrast, the proposed BPBP FMPP is designed to execute logical and arithmetic operations on two words. These operations are performed simultaneously on every group of words, called a block. BPBP FMPP thereby achieves high performance while maintaining the high integration density of the BSWP, and is suitable for various applications.
Hardware Architecture for Kohonen Network
Hidetoshi ONODERA Kiyoshi TAKESHITA Keikichi TAMARU

PAPER-Neural Networks and Chips

Vol:
E76-C No:7
Page(s):
1159-1166
We propose a fully digital architecture for a Kohonen network suitable for VLSI implementation. The proposed architecture adopts a functional memory type parallel processor (FMPP) architecture which has a structure similar to a content addressable memory (CAM). One word of CAM is regarded as a processing element and a group of elements forms a neuron. All processing elements execute the same operation bit-serially but in processor-parallel. Thus the number of instructions for realizing the network algorithm is independent of the number of neurons in the network. With reference to a previously reported CAM, we estimate that a network with 96 neurons for speech recognition could be integrated on three chips using a 1.2 µm process, and that it would operate 50 times faster than sequential hardware. Owing to its highly regular memory structure, the proposed hardware architecture is well suited to current VLSI technology.
Three Dimensional Optical Interconnection Technology for Massively-Parallel Computing Systems
Kazuo KYUMA Shuichi TAI

INVITED PAPER

Vol:
E76-C No:7
Page(s):
1070-1079
Three dimensional (3-D) optics offers potential advantages over electronics for massively-parallel systems from the viewpoint of information transfer. The purpose of this paper is to survey some aspects of 3-D optical interconnection technology for future massively-parallel computing systems. First, the state of the art of current optoelectronic array devices for building the interconnection networks is described, with emphasis on those based on semiconductor technology. Next, the principles, basic architectures, and several examples of 3-D optical interconnection systems in neural networks and multiprocessor systems are described. Finally, the issues that need to be solved to put such technology into practical use are summarized.
A High Speed, Switched-Capacitor Analog-to-Digital Converter Using Unity-Gain Buffers
Satomi OGAWA Kenzo WATANABE

PAPER-Methods and Circuits for Signal Processing

Vol:
E76-A No:6
Page(s):
924-930
A cyclic analog-to-digital (A/D) converter is developed which accomplishes an n-b conversion in n/2 clock cycles. The architecture consists of two 1-b quantizers connected in a loop. A CMOS design of the 1-b quantizer is given to evaluate the performance of the A/D converter when implemented using a presently available process. Spice simulations and error analyses show that a resolution higher than 10-b and a sampling rate up to 1.4 Msps are attainable with a 3-µm CMOS process. A prototype converter breadboarded using discrete components has confirmed the principles of operation and error analyses. The device count and the power consumption are small compared to those of a successive-approximation A/D converter. The chip area required for the CMOS implementation is also small because only four unit capacitors are involved. Therefore, the architecture proposed herein is most suited for high accuracy, medium speed A/D conversion.
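The "n bits in n/2 clock cycles" property can be illustrated with an idealized behavioral model. This is only a sketch under stated assumptions, not the circuit described in the paper: it assumes each 1-b quantizer compares its input against half the reference and passes on a doubled residue, and that both quantizers in the loop resolve one bit each per clock cycle. The names `one_bit_stage`, `cyclic_adc`, and `vref` are hypothetical.

```python
def one_bit_stage(v, vref):
    """Idealized 1-b quantizer (assumed behavior): decide one bit, then
    return the doubled residue for the next stage in the loop."""
    bit = 1 if v >= vref / 2 else 0
    residue = 2 * (v - bit * vref / 2)
    return bit, residue


def cyclic_adc(vin, n_bits, vref=1.0):
    """Convert 0 <= vin < vref to an n_bits code (n_bits assumed even).

    Returns (code, clock_cycles). Two stages are chained per cycle, so the
    loop runs n_bits / 2 times.
    """
    bits = []
    v = vin
    cycles = 0
    while len(bits) < n_bits:
        b1, v = one_bit_stage(v, vref)  # first quantizer in the loop
        b2, v = one_bit_stage(v, vref)  # second quantizer in the loop
        bits += [b1, b2]
        cycles += 1
    code = 0
    for b in bits:  # most significant bit first
        code = (code << 1) | b
    return code, cycles
```

For example, `cyclic_adc(0.625, 4)` yields the 4-b code for 0.625 of full scale after two passes around the loop. Real converters also have to deal with comparator offset, capacitor mismatch, and buffer gain error, none of which this model represents.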
Unified Scheduling of High Performance Parallel VLSI Processors for Robotics
Bumchul KIM Michitaka KAMEYAMA Tatsuo HIGUCHI

PAPER-Parallel Processor Scheduling

Vol:
E76-A No:6
Page(s):
904-910
The performance of processing elements can be improved by progress in VLSI circuit technology, but the communication overhead in a parallel processing system is not negligible. This paper presents a unified scheduling that allocates tasks having different task processing times to multiple processing elements. The objective function is formulated to measure communication time between processing elements. By employing constraint conditions, the scheduling efficiently generates an optimal solution using integer programming so that minimum communication time can be achieved. We also propose a VLSI processor for robotics whose latency is very small. In the VLSI processor, the data transfer between two processing elements can be done very quickly, so that the communication cycle time is greatly reduced.
Overlapped Partitioning Algorithm for the Solution of LSEs with Fixed Size Processor Array
Ben CHEN Mahoki ONODA

PAPER-Modeling and Simulation

Vol:
E76-A No:6
Page(s):
1011-1018
In this paper we present an Overlapped Block Gauss-Seidel (OBGS) algorithm for the solution of large scale LSEs (linear systems of equations) based on an array architecture that we have already proposed. Better partitioning for a processor array usually requires (1) balanced block size, and (2) minimum coupling between blocks for better convergence. These conditions can be satisfied by overlapping some variables in the computation algorithm. The mathematical implication of overlapped partitioning is discussed first, and some examples show the effectiveness of the OBGS algorithm. The conclusion points out that the convergence properties can be improved by a proper choice of overlapped variables. An efficient algorithm is given for choosing blocks and variables in order to realize the above conditions.
Cancellation Technique of Parasitics in Active Filter Design
Takao TSUKUTAKI Masaru ISHIDA Yutaka FUKUI

LETTER-Methods and Circuits for Signal Processing

Vol:
E76-A No:6
Page(s):
957-960
This letter presents a technique to cancel the parasitic effects of an operational amplifier (op amp) in active filter design. To minimize the effects, an op amp model considering the parasitics (i.e., both parasitic poles and zeros) is utilized. It is shown that undesirable factors in the transfer function due to the parasitics can be canceled well by predistorting the passive element values of the circuit. As an example, an active-R highpass filter is evaluated both theoretically and numerically. In this way, the proposed technique can be effectively incorporated into the design of active filters.
Behavior of Solutions Related to an Accuracy Exp(-1/ε)
Makoto ITOH

PAPER-Nonlinear Circuits and Neural Nets

Vol:
E76-A No:6
Page(s):
867-872
Behavior of solutions related to an accuracy exp(-1/ε) is studied. Computer results are given, and examined from the view-point of non-standard analysis. The experimental results raise some important questions on the computer study of slow-fast systems.
A Hardware Architecture Design Methodology for Hidden Markov Model Based Recognition Systems Using Parallel Processing
Jun-ichi TAKAHASHI

PAPER-Digital Signal Processing

Vol:
E76-A No:6
Page(s):
990-1000
This paper presents a hardware architecture design methodology for hidden Markov model (HMM) based recognition systems. With the aim of realizing more advanced and user-friendly systems, an effective architecture has been studied not only for decoding, but also for learning, to make it possible for the system to adapt itself to the user. Considering real-time decoding and efficient learning procedures, a bi-directional ring array processor is proposed that can handle various kinds of data and perform a large number of computations efficiently using parallel processing. With the array architecture, HMM sub-algorithms, namely the forward-backward and Baum-Welch algorithms for learning and the Viterbi algorithm for decoding, can be performed in a highly parallel manner. The indispensable HMM implementation techniques of scaling, smoothing, and estimation for multiple observations can also be carried out in the array without disturbing the regularity of parallel processing. Based on the array processor, we propose the configuration of a system that can realize all HMM processes including vector quantization. This paper also shows that a high PE utilization efficiency of about 70% to 90% can be achieved for practical left-to-right type HMMs.
A GaAs Monolithic Sampling Phase Frequency Comparator for Extending the Pull-In Range of Microwave Phase-Locked Oscillators
Tadao NAKAGAWA Tetsuo HIROTA Takashi OHIRA

PAPER

Vol:
E76-C No:6
Page(s):
944-949
A novel sampling comparator circuit is presented for extending the pull-in range of microwave phase-locked oscillators (PLOs). It performs both phase and frequency detection without any frequency dividers, and a GaAs MMIC prototype is developed and tested. The proposed comparator improves the pull-in range by about 10 times more than is possible with conventional sampling phase detectors.
RHINE: Reconfigurable Multiprocessor System for Video CODEC
Yoshinori TAKEUCHI Zhao-Chen HUANG Masatomo SAEKI Hiroaki KUNIEDA

PAPER-Methods and Circuits for Signal Processing

Vol:
E76-A No:6
Page(s):
947-956
This paper introduces a new application-specific architecture, RHINE (Reconfigurable Hierarchical Image Neo-multiprocessor Engine), a multiprocessor system for moving picture CODEC. The array processor is known to be well suited to data parallel processing such as image signal processing, which requires a vast amount of computation and applies identical instruction sequences to the data. However, the moving picture CODEC algorithm suffers from a large load imbalance when separate sub-images are processed on multiple processors. Some load balancing techniques are indispensable in such applications for the highest speed-up. RHINE gives one of the optimal solutions for such load balancing owing to its self-reconfigurable architecture. RHINE is organized hierarchically from Block Processing Units (BPUs), each of which has a common-bus multiprocessor architecture with a block memory. Processors in a BPU move to another BPU according to the load imbalance between BPUs by switching the bus connection between BPUs. The advantage of the RHINE architecture is demonstrated by performance simulations for real moving pictures.
Parallel Viterbi Decoding Implementation by Multi-Microprocessors
Hui ZHAO Xiaokang YUAN Toru SATO Iwane KIMURA

PAPER-Communication Theory

Vol:
E76-B No:6
Page(s):
658-666
The Viterbi algorithm is a well-established technique for channel and source decoding in high performance digital communication systems. However, excessive time consumption makes it difficult to design an efficient high-speed decoder for practical application. This paper describes the implementation of a parallel Viterbi algorithm by multi-microprocessors. Internal computations are performed in a parallel fashion. The use of microprocessors allows low-cost implementation with moderate complexity. The software and hardware implementations of the Viterbi algorithm on parallel multi-microprocessors for real-time decoding are presented. The implemented method is based on a combination of forming a set of tables and calculations. For efficient operation under fully parallel Viterbi decoding by microprocessors, we considered: (1) branch metrics processing, path metrics updating, path memory updating and decoding output for a microprocessor, (2) efficient decomposition of the sequential Viterbi algorithm into parallel algorithms, (3) minimization of the communication among the microprocessors. The practical solutions for the problems of synchronization among the microprocessors, the interconnection network for communication among the microprocessors, and memory management are discussed. Furthermore, the performance and the speed of the parallel Viterbi decoding are given. For a fixed processing speed of given hardware, parallel Viterbi decoding allows a linear speed-up in the throughput rate with a linear increase in hardware complexity.
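For orientation, the four per-step operations named in item (1) (branch metrics, path metric update, path memory update, decoded output) can be seen in a plain sequential hard-decision decoder. The sketch below is not the parallel multi-microprocessor method of the paper. It assumes a rate-1/2, constraint-length-3 convolutional code with generators 7 and 5 (octal), a common textbook choice that the abstract does not specify, and all names (`G`, `K`, `encode`, `viterbi_decode`) are hypothetical.

```python
G = (0b111, 0b101)        # assumed generator polynomials (7, 5 octal)
K = 3                     # assumed constraint length
N_STATES = 1 << (K - 1)   # the state is the previous K-1 input bits


def parity(x):
    return bin(x).count("1") & 1


def encode(bits):
    """Rate-1/2 convolutional encoder starting from the all-zero state."""
    state, out = 0, []
    for b in bits:
        reg = (b << (K - 1)) | state
        out.extend(parity(reg & g) for g in G)
        state = reg >> 1
    return out


def viterbi_decode(received):
    """Hard-decision Viterbi decoding of a list of received code bits."""
    inf = float("inf")
    metrics = [0] + [inf] * (N_STATES - 1)   # path metrics
    paths = [[] for _ in range(N_STATES)]    # path memory (survivors)
    for t in range(0, len(received), len(G)):
        symbol = received[t:t + len(G)]
        new_metrics = [inf] * N_STATES
        new_paths = [None] * N_STATES
        for state in range(N_STATES):
            if metrics[state] == inf:
                continue                     # state not reachable yet
            for b in (0, 1):
                reg = (b << (K - 1)) | state
                expected = [parity(reg & g) for g in G]
                # Branch metric: Hamming distance to the received symbol.
                branch = sum(e != r for e, r in zip(expected, symbol))
                nxt = reg >> 1
                cand = metrics[state] + branch
                if cand < new_metrics[nxt]:  # add-compare-select
                    new_metrics[nxt] = cand
                    new_paths[nxt] = paths[state] + [b]
        metrics, paths = new_metrics, new_paths
    best = min(range(N_STATES), key=lambda s: metrics[s])
    return paths[best]                       # decoded output
```

The inner loop over `state` is independent for each time step, which is the part a parallel decomposition would distribute across processors; how the paper partitions it is not stated in the abstract.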
A 10-b 300-MHz Interpolated-Parallel A/D Converter
Hiroshi KIMURA Akira MATSUZAWA Takashi NAKAMURA Shigeki SAWADA

PAPER

Vol:
E76-C No:5
Page(s):
778-786
This paper describes a monolithic 10-b A/D converter that realizes a maximum conversion frequency of 300 MHz. Through the development of the interpolated-parallel scheme, the severe requirement for the transistor Vbe matching can be alleviated drastically, which improves differential nonlinearity (DNL) significantly to within 0.4 LSB. Furthermore, an extremely small input capacitance of 8 pF can be attained, which translates into better dynamic performance such as SNR of 56 dB and THD of 59 dB for an input frequency of 10 MHz. Additionally, the folded differential logic circuit has been developed to reduce the number of elements, power dissipation, and die area drastically. Consequently, the A/D converter has been implemented as a 9.0 × 4.2-mm² chip integrating 36K elements, which consumes 4.0 W using a 1.0-µm-rule, 25-GHz ft, double-polysilicon self-aligned bipolar technology.
Some Properties and a Necessary and Sufficient Condition for Extended Kleene-Stone Logic Functions
Noboru TAKAGI Kyoichi NAKASHIMA Masao MUKAIDONO

PAPER-Logic and Logic Functions

Vol:
E76-D No:5
Page(s):
533-539
Recently, fuzzy logic, which is a kind of infinite multiple-valued logic, has been studied to treat certain ambiguities, and its algebraic properties have been studied under the name of fuzzy logic functions. In order to treat modality (necessity, possibility) in fuzzy logic, which is an important concept of multiple-valued logic, the intuitionistic logical negation is required in addition to the operations of fuzzy logic. Infinite multiple-valued logic functions introducing the intuitionistic logical negation into fuzzy logic functions are called Kleene-Stone logic functions, and they enable us to treat modality. The domain of modality that Kleene-Stone logic functions can handle, however, is too limited. We will define α-KS logic functions as infinite multiple-valued logic functions using a unary operation instead of the intuitionistic logical negation of Kleene-Stone logic functions. In α-KS logic functions, modality is closer to our feelings. In this paper we will show some algebraic properties of α-KS logic functions. In particular we prove that any n-variable α-KS logic function is determined uniquely by all inputs of 7 values, which are 7 specific truth values of the original infinite truth values. This means that there is a bijection between the set of α-KS logic functions and the set of 7-valued α-KS logic functions, which are restrictions of α-KS logic functions to 7 specific truth values. Finally, we show a necessary and sufficient condition for a 7-valued logic function to be a 7-valued α-KS logic function.
BiCMOS Circuit Techniques for 3.3 V Microprocessors
Fumio MURABAYASHI Tatsumi YAMAUCHI Masahiro IWAMURA Takashi HOTTA Tetsuo NAKANO Yutaka KOBAYASHI

PAPER

Vol:
E76-C No:5
Page(s):
695-700
With increases in frequency and density of RISC microprocessors due to rapid advances in architecture, circuit and fine device technologies, power consumption becomes a bigger concern. Supply voltage should be reduced from 5 V to 3.3 V. In this paper, several novel circuits using 0.5 µm BiCMOS technology are proposed. These can be applied to a superscalar RISC microprocessor at 3.3 V power supply or below. High speed and low power consumption characteristics are achieved in a floating-point data path, an integer data path and a TLB by using the proposed circuits. The three concepts behind the proposed high speed circuit techniques at low voltage are summarized as follows. There are a number of heavy load paths in a microprocessor, and these become critical paths under low voltage conditions. To achieve high speed characteristics under heavy load conditions without increasing circuit area, low voltage swing operation of a circuit is effective. By exploiting the high conductance of a bipolar transistor, instead of using an MOS transistor, low swing operation can be obtained. This first concept is applied to a single-ended common-base sense circuit with low swing data lines in the register file of a floating-point and an integer data path. Both multi-series transistor connections and voltage drops by Vth of MOS transistors and Vbe of bipolar transistors also degrade the speed performance of a circuit. The second concept employed is therefore a wired-OR logic circuit technique using bipolar transistors, which is applied to a comparator in the TLB instead of multi-series transistor connections of CMOS circuits. The third concept, to overcome the voltage drops by Vth and Vbe, is the addition of a pull-up PMOS to both the path logic adder and the BiNMOS logic gate to ensure the circuits have full swing operation.
Optical Multiplex Computing Based on Set-Valued Logic and Its Application to Parallel Sorting Networks
Shuichi MAEDA Takafumi AOKI Tatsuo HIGUCHI

PAPER-Optical Logic

Vol:
E76-D No:5
Page(s):
605-615
A new computer architecture using multiwavelength optoelectronic integrated circuits (OEICs) is proposed to attack the problems caused by interconnection complexity. Multiwavelength-OEIC architectures, where various wavelengths are employed as information carriers, provide the wavelength as an extra dimension of freedom for parallel processing, so that we can perform several independent computations in parallel in a single optical module using the wavelength space. This "multiplex computing" enables us to reduce the wiring area required by a network and improve its complexity. In this paper, we discuss the efficient multiplexing of Batcher's bitonic sorting networks, highly parallel computing architectures that inherently require global interconnections. A systematic multiplexing of interconnection topology is presented using a binary representation of the connectivities of interconnection paths. It is shown that the wiring area can be reduced by a factor of 1/r² using r kinds of wavelength components.
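To make the global interconnections of Batcher's bitonic sorting network concrete, the following is a minimal software sketch of the standard network, not of the wavelength-multiplexed design the paper proposes. Each comparator connects wire `i` to wire `i XOR j`, so the wiring pattern of every stage follows directly from the binary representation of the wire index. The function name `bitonic_sort` is hypothetical, and the input length is assumed to be a power of two.

```python
def bitonic_sort(values):
    """Sort a list whose length is a power of two with Batcher's
    bitonic sorting network, returning a new ascending list."""
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    k = 2
    while k <= n:              # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # distance spanned by each comparator
            for i in range(n):
                partner = i ^ j            # wire connected to wire i
                if partner > i:            # handle each comparator once
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a
```

Within one value of `j`, all comparators are independent and could operate at the same time; the comparators with large `j` are the long, global connections that dominate wiring area in a planar layout.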
Code Assignment Algorithm for Highly Parallel Multiple-Valued Combinational Circuits Based on Partition Theory
Saneaki TAMAKI Michitaka KAMEYAMA Tatsuo HIGUCHI

PAPER-Logic Design

Vol:
E76-D No:5
Page(s):
548-554
Design of locally computable combinational circuits is a very important subject for implementing high-speed compact arithmetic and logic circuits in VLSI systems. This paper describes a multiple-valued code assignment algorithm for locally computable combinational circuits, when a functional specification for a unary operation is given by the mapping relationship between input and output symbols. Partition theory, usually used in the design of sequential circuits, is effectively employed for fast search in the code assignment problem. Based on the partition theory, a mathematical foundation is derived for the locally computable circuit design. Moreover, for permutation operations, we propose an efficient code assignment algorithm based on closed chain sets to reduce the number of combinations in the search procedure. Some examples are shown to demonstrate the usefulness of the algorithm.
Simple Quotient-Digit-Selection Radix-4 Divider with Scaling Operation
Motonobu TONOMURA

PAPER

Vol:
E76-A No:4
Page(s):
593-602
This paper deals with the theory and design method of an efficient radix-4 divider using carry-propagation-free adders based on redundant binary {-1,0,+1} representation. The usual method of normalizing the divisor in the range [1/2,1) eliminates the advantages of using a higher radix than two, because many digits of the partial remainder are required to select the quotient digits. In the radix-4 case, it is shown that it is possible to select the quotient digits by referring to only the four most significant digits of the partial remainder (seven in the usual normalizing method), by scaling the divisor in the range [12/8,13/8). This leads to radix-4 dividers more effective than radix-2 ones. We use the hyperstring graph representation proposed in Ref.(18) for redundant binary adders.
