IEICE global.ieice.org Site

Keyword Search Result

[Keyword] SPE(2504hit)

2421-2440hit(2504hit)

Speculative Execution and Reducing Branch Penalty on a Superscalar Processor
Hideki ANDO Chikako NAKANISHI Hirohisa MACHIDA Tetsuya HARA Masao NAKAYA

PAPER-Improved Binary Digital Architectures

Vol:
E76-C No:7
Page(s):
1080-1093
Superscalar processors improve performance by exploiting instruction-level parallelism (ILP). ILP in a basic block is, however, not sufficient on non-numerical applications for gaining substantial speedup. Instructions across branches are required to be executed in parallel to dramatically improve performance. That is, speculative execution is strongly required. Boosting is a general solution to achieving speculative execution. Boosting labels an instruction to be speculatively executed, and the hardware handles side-effects. This paper describes the efficient implementation of boosting in terms of cost/performance trade-offs. Our policy in implementation is beneficial in code scheduling heuristics, penalties imposed by code duplication to maintain program semantics, and area cost. This paper also describes a branch scheme which minimizes branch penalty. Branch delay causes crucial penalties on the performance of superscalar processors since multiple delay slots exist even in a single delay cycle. Our scheme is the fetching of both sequential and target instructions, and either of them is selected on a branch. No delay cycle can be imposed. This scheme is realized by a combination of static code movement and hardware support. As a result, we reduce branch penalty with small cost. Simulation results show that our ideas are highly effective in improving the performance of a superscalar processor.
A Simplified Realization of Adaptive Notch Filter and Its Convergence Properties
Shotaro NISHIMURA

LETTER

Vol:
E76-A No:7
Page(s):
1147-1149
In this letter, a new structure for an adaptive IIR notch filter is presented. The structure is based on direct-form realization and uses an adaptation algorithm similar to that given in Ref. (4). A quantitative analysis of the convergence properties is developed. It is shown that the proposed structure offers superior performance compared with previously proposed designs. The results of computer simulations are presented to substantiate the analysis.
Pitch Synchronous Innovation CELP (PSI-CELP)
Takehiro MORIYA Satoshi MIKI Kazunori MANO Hitoshi OHMURO

LETTER

Vol:
E76-A No:7
Page(s):
1177-1180
A speech coding scheme at 3.6 kbit/s has been proposed. The scheme is based on CELP (Code Excited Linear Prediction) with pitch synchronous innovation, which means even random codevectors as well as adaptive codevectors have pitch periodicity. The quality is comparable to 6.7 kbit/s VSELP coder for the Japanese cellular radio standard.
A Hardware Architecture Design Methodology for Hidden Markov Model Based Recognition Systems Using Parallel Processing
Jun-ichi TAKAHASHI

PAPER-Digital Signal Processing

Vol:
E76-A No:6
Page(s):
990-1000
This paper presents a hardware architecture design methodology for hidden Markov model (HMM) based recognition systems. With the aim of realizing more advanced and user-friendly systems, an effective architecture has been studied not only for decoding but also for learning, so that the system can adapt itself to the user. Considering real-time decoding and efficient learning procedures, a bi-directional ring array processor is proposed that can handle various kinds of data and perform a large number of computations efficiently using parallel processing. With the array architecture, the HMM sub-algorithms (the forward-backward and Baum-Welch algorithms for learning and the Viterbi algorithm for decoding) can be performed in a highly parallel manner. The indispensable HMM implementation techniques of scaling, smoothing, and estimation for multiple observations can also be carried out in the array without disturbing the regularity of parallel processing. Based on the array processor, we propose the configuration of a system that can realize all HMM processes including vector quantization. This paper also shows that a high PE utilization efficiency of about 70% to 90% can be achieved for practical left-to-right type HMMs.
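The abstract above names the Viterbi algorithm as the decoding step. As a point of reference only, here is a minimal sequential sketch of Viterbi decoding for a discrete-observation HMM in Python. It does not model the ring array processor or the parallel mapping described in the paper, and the three-state left-to-right model with its probabilities is a made-up example.

```python
import math

def viterbi(obs, start_p, trans_p, emit_p):
    """Return (best_log_prob, best_state_path) for a discrete-observation HMM."""
    n_states = len(start_p)

    def log(p):
        # log(0) is treated as -infinity so forbidden transitions never win
        return math.log(p) if p > 0.0 else float("-inf")

    # delta[s]: best log-probability of any state path that ends in state s
    delta = [log(start_p[s]) + log(emit_p[s][obs[0]]) for s in range(n_states)]
    backptr = []
    for o in obs[1:]:
        new_delta, ptr = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda r: delta[r] + log(trans_p[r][s]))
            new_delta.append(delta[best_prev] + log(trans_p[best_prev][s])
                             + log(emit_p[s][o]))
            ptr.append(best_prev)
        delta = new_delta
        backptr.append(ptr)

    # Trace the best path backwards from the best final state
    last = max(range(n_states), key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return delta[last], path

# Hypothetical 3-state left-to-right model with 2 observation symbols
start_p = [1.0, 0.0, 0.0]
trans_p = [[0.5, 0.5, 0.0],
           [0.0, 0.5, 0.5],
           [0.0, 0.0, 1.0]]
emit_p = [[0.9, 0.1],
          [0.1, 0.9],
          [0.9, 0.1]]

log_prob, path = viterbi([0, 0, 1, 1, 0], start_p, trans_p, emit_p)
best_prob = math.exp(log_prob)
```

In a left-to-right model the transition matrix is upper triangular, so each state only needs values from itself and its predecessor. That kind of regular, local data flow is what makes array-style parallel implementations attractive.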
Overlapped Partitioning Algorithm for the Solution of LSEs with Fixed Size Processor Array
Ben CHEN Mahoki ONODA

PAPER-Modeling and Simulation

Vol:
E76-A No:6
Page(s):
1011-1018
In this paper we present an Overlapped Block Gauss-Seidel (OBGS) algorithm for the solution of large-scale LSEs (Linear Systems of Equations), based on an array architecture that we have previously proposed. Better partitioning for a processor array usually requires (1) balanced block sizes and (2) minimum coupling between blocks for better convergence. These conditions can be satisfied well by overlapping some variables in the computation algorithm. The mathematical implication of overlapped partitioning is discussed first, and some examples show the effectiveness of the OBGS algorithm. The conclusion points out that the convergence properties can be improved considerably by a proper choice of overlapped variables. An efficient algorithm is given for choosing blocks and variables so as to realize the above conditions.
Unified Scheduling of High Performance Parallel VLSI Processors for Robotics
Bumchul KIM Michitaka KAMEYAMA Tatsuo HIGUCHI

PAPER-Parallel Processor Scheduling

Vol:
E76-A No:6
Page(s):
904-910
The performance of processing elements can be improved by progress in VLSI circuit technology, but the communication overhead in a parallel processing system cannot be neglected. This paper presents a unified scheduling method that allocates tasks having different processing times to multiple processing elements. The objective function is formulated to measure communication time between processing elements. By employing constraint conditions, the scheduling efficiently generates an optimal solution using integer programming so that minimum communication time can be achieved. We also propose a VLSI processor for robotics whose latency is very small. In the VLSI processor, the data transfer between two processing elements can be done very quickly, so that the communication cycle time is greatly reduced.
A Frequency Utilization Efficiency Improvement on Superposed SSMA-QPSK Signal Transmission over High Speed QPSK Signals in Nonlinear Channels
Takatoshi SUGIYAMA Hiroshi KAZAMA Masahiro MORIKURA Shuji KUBOTA Shuzo KATO

PAPER

Vol:
E76-B No:5
Page(s):
480-487
This paper proposes a superposed SSMA (Spread Spectrum Multiple Access)-QPSK (Quadrature Phase Shift Keying) signal transmission scheme over high speed QPSK signals to achieve higher frequency utilization efficiency and to facilitate lower power transmitters for SSMA-QPSK signal transmission. Experimental results show that the proposed scheme, which employs rate one-half FEC (Forward Error Correction) and a newly proposed co-channel interference cancellation scheme for SSMA-QPSK signals, can transmit twenty SSMA-QPSK channels simultaneously over a nonlinearly amplified high speed QPSK signal transmission channel. This is ten times as many SSMA channels as can be transmitted without co-channel interference cancellation when the ratio of the SSMA-QPSK signal power to the high speed QPSK signal power equals -30 dB. Moreover, the feasibility of generating the interference signal replica for cancellation through practical hardware implementation is clarified.
Process and Device Technologies of CMOS Devices for Low-Voltage Operation
Masakazu KAKUMU

INVITED PAPER

Vol:
E76-C No:5
Page(s):
672-680
Process and device technologies of CMOS devices for low-voltage operation are described. First, the optimum power-supply voltage for CMOS devices is examined in detail from the viewpoints of circuit performance, device reliability and power dissipation. As a result, it is confirmed that the power-supply voltage can be reduced without any speed loss of the CMOS device. Based upon theoretical understanding, the author suggests that lowering the threshold voltage and reducing the junction capacitance are indispensable for CMOS devices with a low-voltage supply, in order to improve the circuit performance as expected from MOS device scaling. Process and device technologies such as the Silicon On Insulator (SOI) device, low-temperature operation and the CMOS Shallow Junction Well FET (CMOS-SJET) structure are reviewed for reduction of the threshold voltage and junction capacitance, which lead to high-speed operation of the CMOS device at low voltage.
A 10-b 300-MHz Interpolated-Parallel A/D Converter
Hiroshi KIMURA Akira MATSUZAWA Takashi NAKAMURA Shigeki SAWADA

PAPER

Vol:
E76-C No:5
Page(s):
778-786
This paper describes a monolithic 10-b A/D converter that realized a maximum conversion frequency of 300 MHz. Through the development of the interpolated-parallel scheme, the severe requirement for the transistor Vbe matching can be alleviated drastically, which improves differential nonlinearity (DNL) significantly to within 0.4 LSB. Furthermore, an extremely small input capacitance of 8 pF can be attained, which translates into better dynamic performance such as SNR of 56 dB and THD of 59 dB for an input frequency of 10 MHz. Additionally, the folded differential logic circuit has been developed to reduce the number of elements, power dissipation, and die area drastically. Consequently, the A/D converter has been implemented as a 9.0 × 4.2-mm² chip integrating 36K elements, which consumes 4.0 W using a 1.0-µm-rule, 25-GHz ft, double-polysilicon self-aligned bipolar technology.
Fundamental Properties of Multiple-Valued Logic Functions Monotonic with Respect to Ambiguity
Kyoichi NAKASHIMA Noboru TAKAGI

PAPER-Logic and Logic Functions

Vol:
E76-D No:5
Page(s):
540-547
The paper considers multiple-valued logic systems having the property that the ambiguity of the system increases as the ambiguity of each component increases. The partial-ordering relation with respect to ambiguity with the greatest element 1/2 and minimal elements 0, 1, or simply the ambiguity relation, is introduced in the set of truth values $V = \{0, 1/(p-1), \ldots, 1/2, \ldots, (p-2)/(p-1), 1\}$. A-monotonic p-valued logic functions are defined as p-valued logic functions monotonic with respect to the ambiguity relation. A necessary and sufficient condition for A-monotonic p-valued logic functions is presented along with the proofs, and their logic formulae using unary operators defined in the ambiguity relation are given. Some discussions on the extension of theories to other partial-ordering relations are also given.
Quantum Theory, Computing and Chaotic Solitons
Paul J. WERBOS

PAPER-Chaos and Related Topics

Vol:
E76-A No:5
Page(s):
689-694
This paper describes new mathematical tools, taken from quantum field theory (QFT), which may make it possible to characterize localized excitations (including solitons, but also including chaotic modes) generated by PDE systems. The significance to computer hardware and neurocomputing is also discussed. This mathematics--IF further developed--may also have the potential to reorganize and simplify our understanding of QFT itself--a topic of very great intellectual and practical importance. The paper concludes by describing three new possibilities for research, which will be very important to achieving these goals.
On the Specification for VLSI Systolic Arrays
Fuyau LIN

PAPER

Vol:
E76-A No:4
Page(s):
496-506
Formal verification has become an increasingly prominent technique for establishing the correctness of hardware designs. We present a framework for specifying and verifying the design of systolic architectures. Our approach allows users to represent systolic arrays in the Z specification language and to justify the design semi-automatically using the verifier. Z is a notation based on typed set theory and enriched by a schema calculus. We describe how a systolic array for matrix-vector multiplication can be specified and justified with respect to its algorithm.
Packet Speech Transmission on ATM Networks Using a Variable Rate Embedded ADPCM Coding Scheme
Kazuhiro KONDO Masashi OHNO

PAPER-Communication Systems and Transmission Equipment

Vol:
E76-B No:4
Page(s):
420-430
Subjective quality tests have proven that embedded adaptive differential PCM (ADPCM), known to tolerate information loss through bit dropping, does not maintain sufficient speech quality when directly applied to asynchronous transfer mode (ATM) due to the fixed-length cell transmission scheme unique to ATM. We propose a coding and transmission scheme which enhances the performance by adjusting the embedded ADPCM coding rate according to input speech characteristics, thereby taking advantage of the ATM environment, where the transmission of variable rate sources is feasible. By varying the number of code bits of an embedded ADPCM coder from 6bits per sample, or 48kbps, for blocks of speech with a high prediction gain, to 2bits, or 16kbps, for silent blocks, a good compromise between coding bit rate and speech quality with gradual degradation due to information loss is achieved. The results of subjective evaluation tests showed the speech quality of the proposed scheme to be over 3.5 mean opinion score (MOS) on a scale of 1 to 5 at a cell loss rate of 10%. A prototype of the codec and the ATM cell assembly/disassembly functions were also fabricated using 3 conventional digital signal processors (DSPs) for real-time conversation tests.
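The rate figures in the abstract above follow from bits per sample times sampling rate. A small Python sketch of that arithmetic is given below. The 8 kHz sampling rate is an assumption (it is the value implied by 6 bits per sample giving 48 kbps), and the block mix in the usage line is hypothetical rather than taken from the paper.

```python
SAMPLE_RATE_HZ = 8000  # assumption: implied by 6 bits/sample <-> 48 kbps

def bit_rate_kbps(bits_per_sample, sample_rate_hz=SAMPLE_RATE_HZ):
    """Coding rate in kbps for a fixed number of code bits per sample."""
    return bits_per_sample * sample_rate_hz / 1000

def average_rate_kbps(block_bits):
    """Average rate over equal-length blocks, each coded with its own bit count."""
    return sum(bit_rate_kbps(b) for b in block_bits) / len(block_bits)

# Hypothetical mix: one high-prediction-gain block, one mid block, two silent blocks
example_average = average_rate_kbps([6, 4, 2, 2])
```

Because silent blocks drop to 2 bits per sample, the average rate of a conversation with many pauses sits well below the 48 kbps peak, which is the saving a variable-rate source can exploit on ATM.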
Analysis/Synthesis of Speech Using the Short-Time Fourier Transform and a Time-Varying ARMA Process
Andreas SPANIAS Philipos LOIZOU Gim LIM Ye CHEN Gen HU

PAPER-Speech

Vol:
E76-A No:4
Page(s):
645-652
A speech analysis/synthesis system that relies on a time-varying Auto Regressive Moving Average (ARMA) process and the Short-Time Fourier Transform (STFT) is proposed. The narrowband components in speech are represented in the frequency domain by a set of harmonic components, while the broadband random components are represented by a time-varying ARMA process. The time-varying ARMA model has a dual function, namely, it creates a spectral envelope that fits accurately the harmonic STFT components, and provides for the spectral representation of the broadband components of speech. The proposed model essentially combines the features of waveform coders by employing the STFT and the features of traditional vocoders by incorporating an appropriately shaped noise sequence.
Effect of Noise-Only-Paths on the Performance Improvement of Post-Demodulation Selection Diversity in DS/SS Mobile Radio
Akihiro HIGASHI Tadashi MATSUMOTO Mohsen KAVEHRAD

PAPER-Radio Communication

Vol:
E76-B No:4
Page(s):
438-443
The path diversity improvement inherent in direct sequence spread spectrum (DS/SS) signalling under multi-path propagation environments is investigated for mobile/personal radio communications systems that employ DPSK modulation. The bit error rate (BER) performance of post-demodulation selection diversity reception is theoretically analyzed in the presence of noise-only-paths in the time window for diversity combining. Results of laboratory experiments conducted to evaluate the BER performance are also presented. It is shown that the experimental results agree well with the theoretical BER.
VLSI-Oriented Multiple-Valued Current-Mode Arithmetic Circuits Using Redundant Number Representations
Shoji KAWAHITO Yasuhiro MITSUI Tetsuro NAKAMURA

PAPER

Vol:
E76-C No:3
Page(s):
446-454
This paper presents a VLSI-oriented arithmetic design method using a radix-2 redundant number representation with digit set {0, 1, 2} and multiple-valued current-mode (MVCM) circuit technology. We propose a carry-propagation-free (CPF) parallel addition method with redundant digit set {0, 1, 2} which is suitable for design with MVCM circuits. Several types of CPF parallel adders are compared, and the proposed CPF parallel adder with MVCM circuits offers the best total performance with respect to speed, complexity, and power dissipation. The designed basic arithmetic circuits have sufficient noise immunity to supply voltage fluctuation, which is important for stable operation of VLSI circuits. The CPF parallel adder is effectively used as the reduction scheme for partial products in a high-speed compact multiplier. For example, the designed 32×32 bit multiplier reduces the number of active elements to two-thirds and the number of interconnections to one-fifth of those of the corresponding binary Wallace tree multiplier, while the speed is almost the same. The structure is simple and regular. The static power dissipation of the designed 32-bit multiplier is estimated to be 212 mW on average and 708 mW in the worst case. The total power including dynamic power dissipation would not be so large compared with that of the 32-bit binary CMOS multiplier reported under 10 MHz operation.
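To illustrate what carry-propagation-free addition means for the radix-2 digit set {0, 1, 2}, the Python sketch below uses a generic three-step decomposition in which every output digit depends only on input positions i, i-1 and i-2. It is an illustration under that assumption, not the adder circuit proposed in the paper; the function names are hypothetical.

```python
def to_value(digits):
    """Integer value of a little-endian radix-2 digit list (digits 0, 1 or 2)."""
    return sum(d << i for i, d in enumerate(digits))

def cpf_add(x, y):
    """Add two little-endian radix-2 numbers whose digits lie in {0, 1, 2}.

    The result has two extra digit positions and digits in {0, 1, 2}.
    No carry ripples along the word: each step only looks one position down.
    """
    n = max(len(x), len(y))
    x = x + [0] * (n - len(x))
    y = y + [0] * (n - len(y))

    # Step 1: position sums z_i in {0..4}, split as z_i = 2*c1_{i+1} + w_i
    z = [a + b for a, b in zip(x, y)]
    w = [v % 2 for v in z] + [0, 0]        # w_i in {0, 1}
    c1 = [0] + [v // 2 for v in z] + [0]   # c1_i in {0, 1, 2}

    # Step 2: u_i = w_i + c1_i in {0..3}, split as u_i = 2*c2_{i+1} + v_i
    u = [a + b for a, b in zip(w, c1)]
    v = [a % 2 for a in u]                 # v_i in {0, 1}
    c2 = [0] + [a // 2 for a in u[:-1]]    # c2_i in {0, 1}; the top u is always 0

    # Step 3: s_i = v_i + c2_i in {0, 1, 2}
    return [a + b for a, b in zip(v, c2)]

# Example: 5 written redundantly as [1, 2] (1 + 2*2) plus 6 written as [2, 2]
example_sum = cpf_add([1, 2], [2, 2])
```

Since the depth of the computation is fixed regardless of word length, the addition time does not grow with the number of digits, which is why such adders suit partial-product reduction in multipliers.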
Automatic Evaluation of English Pronunciation Based on Speech Recognition Techniques
Hiroshi HAMADA Satoshi MIKI Ryohei NAKATSU

PAPER-Speech Processing

Vol:
E76-D No:3
Page(s):
352-359
A new method is proposed for automatically evaluating the English pronunciation quality of non-native speakers. It is assumed that pronunciation can be rated using three criteria: the static characteristics of phonetic spectra, the dynamic structure of spectrum sequences, and the prosodic characteristics of utterances. The evaluation uses speech recognition techniques to compare the English words pronounced by a non-native speaker with those pronounced by a native speaker. Three evaluation measures are proposed to rate pronunciation quality. (1) The standard deviation of the mapping vectors, which map the codebook vectors of the non-native speaker onto the vector space of the native speaker, is used to evaluate the static phonetic spectra characteristics. (2) The spectral distance between words pronounced by the non-native speaker and those pronounced by the native speaker obtained by the DTW method is used to evaluate the dynamic characteristics of spectral sequences. (3) The differences in fundamental frequency and speech power between the pronunciation of the native and non-native speaker are used as the criteria for evaluating prosodic characteristics. Evaluation experiments are carried out using 441 words spoken by 10 Japanese speakers and 10 native speakers. One half of the 441 words was used to evaluate static phonetic spectra characteristics, and the other half was used to evaluate the dynamic characteristics of spectral sequences, as well as the prosodic characteristics. Based on the experimental results, the correlation between the evaluation scores and the scores determined by human judgement is found to be 0.90.
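Measure (2) in the abstract above relies on the DTW (dynamic time warping) method to compare a non-native utterance with a native one. A minimal Python sketch of a DTW distance follows. It uses scalar features and a symmetric three-way step pattern as simplifying assumptions; the paper works on spectral sequences, and its local distance and path constraints are not given here.

```python
def dtw_distance(a, b, dist=lambda p, q: abs(p - q)):
    """Accumulated DTW distance between two sequences of scalar features."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j]: best accumulated distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

# A time-stretched copy of the same contour aligns with zero distance
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
```

For vector features such as spectral frames, the `dist` argument can be replaced by any frame-to-frame distance, for example a Euclidean distance over the frame coefficients.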
Text-Independent Speaker Recognition Using Neural Networks
Hiroaki HATTORI

PAPER-Speech Processing

Vol:
E76-D No:3
Page(s):
345-351
This paper describes a text-independent speaker recognition method using predictive neural networks. For text-independent speaker recognition, an ergodic model which allows transitions to any other state, including self-transitions, is adopted as the speaker model and one predictive neural network is assigned to each state. The proposed method was compared to quantization distortion based methods, HMM based methods, and a discriminative neural network based method through text-independent speaker identification experiments on 24 female speakers. The proposed method gave the highest identification rate of 100.0%, and the effectiveness of predictive neural networks for representing speaker individuality was clarified.
Characterization of Inverted Slot Line for Travelling Wave Optical Modulator
Tsukasa YONEYAMA Tohru IWASAKI

PAPER-Optical/Microwave Devices

Vol:
E76-C No:2
Page(s):
229-237
The inverted slot line (ISL) has been proposed for millimeter-wave LiNbO3 optical modulators. It is simple in structure, and capable of achieving the perfect velocity matching between carrier and modulating waves. The excellent performance of the ISL optical modulator has been demonstrated at 100 GHz, and the extension into the 50 GHz range is being expected. This paper addresses the analysis of the ISL based on the spectral domain approach. The major results obtained here are the demonstration of the perfect velocity matching not only at 10 GHz but also at 50 GHz, and the characterization of the ISL in terms of effective refractive index, characteristic impedance, overlap integral factor and transmission loss. The depth of optical phase modulation is also estimated at 50 GHz to show a promising performance in the millimeter-wave frequency range. The effective refractive index and the characteristic impedance are found to be theoretically predictable, but the field profile, the overlap integral factor and the transmission loss are not necessarily in good agreement with measurements. As a result of analysis, it can be concluded that the Y-cut substrate is superior to the Z-cut substrate in the following respects: 1. Coupling with the surface wave mode hardly occurs near the operating frequency range. 2. The perfect velocity matching can be attained with a larger spacing between the electrode and the ground plane. 3. The transmission loss is smaller. 4. The field intensity and the overlap integral factor do not seem to be much deteriorated in the actual ISL.
Hybrid Photonic-Microwave Systems and Devices
Peter R. HERCZFELD

INVITED PAPER

Vol:
E76-C No:2
Page(s):
191-197
Research in optical microwave interaction, at its earlier stages, was spurred by the desire to make an optically fed and controlled phased array antenna with monolithic microwave integrated circuit (MMIC) transmit/receive (T/R) modules. In the first part of this paper experimental results are presented demonstrating an optically fed phased array antenna operating at C-band in the 5.5 to 5.8 GHz frequency range. The present system consists of two optically fed 1×4 subarrays with MMIC based active T/R modules. Custom designed fiber optic links have been employed to provide distribution of data and frequency reference signals to the phased array antenna. One of the challenges of the future is the development of better interfaces between electronic (microwave) and optical components, including the chip level merging of photonic and electronic components on III-V compounds. This aspect of the research is covered in the second half of the paper.
