IEICE global.ieice.org Site

Keyword Search Result

[Keyword] EE(4073hit)

2861-2880hit(4073hit)

A High-Speed and Area-Efficient Dual-Rail PLA Using Divided and Interdigitated Column Circuits
Hiroaki YAMAOKA Makoto IKEDA Kunihiro ASADA

PAPER-Integrated Electronics

Vol: E87-C No: 6
Page(s): 1069-1077
This paper presents a new high-speed and area-efficient dual-rail PLA. The proposed circuit includes three schemes: 1) a divided column scheme (DCS), 2) a programmable sense-amplifier activation scheme (PSAS), and 3) an interdigitated column scheme (ICS). In the DCS, a column circuit of the PLA is divided and the resulting circuits operate in parallel. This enhances the performance of the PLA, and the scheme becomes more effective as the input data bandwidth increases. The PSAS is used to generate an activation pulse for the sense amplifiers in the PLA. In this scheme, the proposed delay generators make it possible to minimize the timing margin, which depends on process variations and operating conditions. The ICS enhances the area efficiency of the PLA by employing a method of physical compaction. This scheme is effective for circuits whose logic functions are regular, such as arithmetic circuits. As applications of the proposed PLA, a comparator, a priority encoder, and an incrementor for 128-bit data processing were designed. The proposed circuit design schemes achieved a 22.2% delay reduction and a 37.5% area reduction on average over the conventional high-speed and low-power PLA in a 0.13-µm CMOS technology with a supply voltage of 1.2 V.
Implementation of a Multi-Class Fair Queueing via Identification of the QoS-Aware Parameters
Daein JEONG Byeongseog CHOE

PAPER-Switching

Vol: E87-B No: 6
Page(s): 1524-1534
This paper proposes a novel method of identifying the design parameters for a practical implementation of the fair queueing discipline, which is capable of class-level delay control. The notion of class weight is introduced first, and then the session weights are determined. This two-phase approach is favorable in terms of scalability; that is, the overall complexity depends only on the number of classes. We propose a packet scheduler referred to as the DPS (Delay-centric Processor Sharing) scheme, which employs those design parameters to deliver class-wise delay bound services. The associated admission policy for delay guarantee is also derived. System analysis and derivation of the parameters have their origins in the understanding of the so-called system equation, which describes the dynamics of the class-level service share. The proposed design parameters are QoS-aware in that they are consistently refined depending on the system status. Several numerical and simulation results show that the DPS scheme outperforms other schemes in terms of both resource efficiency and robustness. Concerning scalability, we show that an alternative tagging process of the DPS scheme is implementable with O(1) complexity with no significant degradation in delay performance.
High Resolution Local Polynomial Approximation Beamforming for Wide Band Moving Sources
Do-Hyun PARK Kyun-Kyung LEE

LETTER-Antenna and Propagation

Vol: E87-B No: 6
Page(s): 1770-1773
The current letter extends narrow band (NB) local polynomial approximation (LPA) beamforming to wide band (WB) rapidly moving sources. Instead of the conventional beamformer weight in NB LPA, the proposed method adopts the steered minimum variance (STMV) method that can achieve a high resolution with short time observations. The performance of the proposed algorithm is demonstrated via computer simulations.
Reflection Attack on a Generalized Key Agreement and Password Authentication Protocol
Wei-Chi KU Hui-Lung LEE Chien-Ming CHEN

LETTER-Fundamental Theories

Vol: E87-B No: 5
Page(s): 1386-1388
In this letter, we show that a key agreement and password authentication protocol proposed by Kwon and Song is potentially vulnerable to a reflection attack, and then suggest simple improvements.
Traffic Engineering with Constrained Multipath Routing in MPLS Networks
Youngseok LEE Yongho SEOK Yanghee CHOI

PAPER-Network

Vol: E87-B No: 5
Page(s): 1346-1356
A traffic engineering problem in a network consists of setting up paths between the edge nodes of the network to meet traffic demands while optimizing network performance. It is known that total traffic throughput in a network, or resource utilization, can be maximized if a traffic demand is split over multiple paths. However, the problem formulation and practical algorithms that calculate the paths and the load-splitting ratios by taking bandwidth, route constraints, or policies into consideration have received little attention. In this paper, we formulate the constrained multipath-routing problems in Linear Programming (LP) with the objective of minimizing the maximum link utilization while satisfying constraints on bandwidth, the maximum hop count, and the not-preferred node/link list. Optimal solutions of paths and load-splitting ratios found by an LP solver are shown to be superior to the conventional shortest path algorithm in terms of maximum link utilization, total traffic volume, and number of required paths. Then, we propose a heuristic algorithm with low computational complexity that finds near-optimal paths and load-splitting ratios satisfying the given constraints. The proposed algorithm is applied to Multi-Protocol Label Switching (MPLS), which permits explicit path setup, and it is tested in a fictitious backbone network. The experimental results show that the heuristic algorithm finds near-optimal solutions.
A Decision Feedback Equalizing Receiver for the SSTL SDRAM Interface with Clock-Data Skew Compensation
Young-Soo SOHN Seung-Jun BAE Hong-June PARK Soo-In CHO

PAPER-Integrated Electronics

Vol: E87-C No: 5
Page(s): 809-817
A CMOS DFE (decision feedback equalization) receiver with clock-data skew compensation was implemented for the SSTL (stub-series terminated logic) SDRAM interface. The receiver consists of a 2-way interleaving DFE input buffer for ISI reduction and a ×2 over-sampling phase detector for finding the optimum sampling clock position. Measurement results at 1.2 Gbps operation showed that equalization increases the voltage margin by about 20% and decreases the time jitter in the recovered sampling clock by about 40% in an SSTL channel with a 2 pF, 4-stub load. The active chip area and power consumption are 300 × 1000 µm² and 142 mW, respectively, with a 2.5 V, 0.25 µm CMOS process.
Braid Groups in Cryptology
Eonkyung LEE

INVITED PAPER

Vol: E87-A No: 5
Page(s): 986-992
Braids have been studied by mathematicians for more than a century. Because they are practical enough to be used for cryptography, many cryptographers have taken an interest in them. Over the last five years, a number of cryptographic applications and cryptanalyses have been proposed in the area of braids. We survey the main examples of these results.
Exploring Human Speech Production Mechanisms by MRI
Kiyoshi HONDA Hironori TAKEMOTO Tatsuya KITAMURA Satoru FUJITA Sayoko TAKANO

INVITED PAPER

Vol: E87-D No: 5
Page(s): 1050-1058
Recent investigations using magnetic resonance imaging (MRI) of human speech organs have opened up new avenues of research. Visualization of the speech production system provides abundant information on the physiological and acoustic realization of human speech. This article summarizes the current status of MRI applications with respect to speech research as well as our own experience of discovery and re-evaluation of acoustic events emanating from the vocal tract and physiological mechanisms.
Speaker Adaptation Method for Acoustic-to-Articulatory Inversion using an HMM-Based Speech Production Model
Sadao HIROYA Masaaki HONDA

PAPER

Vol: E87-D No: 5
Page(s): 1071-1078
We present a speaker adaptation method that makes it possible to determine articulatory parameters from an unknown speaker's speech spectrum using an HMM (Hidden Markov Model)-based speech production model. The model consists of HMMs of articulatory parameters for each phoneme and an articulatory-to-acoustic mapping that transforms the articulatory parameters into a speech spectrum for each HMM state. The model is statistically constructed by using actual articulatory-acoustic data. In the adaptation method, geometrical differences in the vocal tract as well as the articulatory behavior in the reference model are statistically adjusted to an unknown speaker. First, the articulatory parameters are estimated from an unknown speaker's speech spectrum using the reference model. Secondly, the articulatory-to-acoustic mapping is adjusted by maximizing the output probability of the acoustic parameters for the estimated articulatory parameters of the unknown speaker. With the adaptation method, the RMS error between the estimated articulatory parameters and the observed ones is 1.65 mm. The improvement rate over the speaker independent model is 56.1 %.
Orthogonalized Distinctive Phonetic Feature Extraction for Noise-Robust Automatic Speech Recognition
Takashi FUKUDA Tsuneo NITTA

PAPER

Vol: E87-D No: 5
Page(s): 1110-1118
In this paper, we propose a noise-robust automatic speech recognition system that uses orthogonalized distinctive phonetic features (DPFs) as input to an HMM with diagonal covariance. In the orthogonalized DPF extraction stage, a speech signal is first converted to acoustic features composed of local features (LFs) and ΔP; then a multilayer neural network (MLN) with 153 output units, composed of context-dependent DPFs of a preceding context DPF vector, a current DPF vector, and a following context DPF vector, maps the LFs to DPFs. The Karhunen-Loeve transform (KLT) is then applied to orthogonalize each DPF vector in the context-dependent DPFs, using orthogonal bases calculated from a DPF vector that represents 38 Japanese phonemes. Finally, the orthogonalized DPF vectors are decorrelated from one another using the Gram-Schmidt orthogonalization procedure. In experiments, after evaluating the parameters of the MLN input and output units in the DPF extractor, the orthogonalized DPFs are compared with the original DPFs. The orthogonalized DPFs are then evaluated in comparison with a standard parameter set of MFCCs and dynamic features. Next, noise robustness is tested using four types of additive noise. The experimental results show that the use of the proposed orthogonalized DPFs can significantly reduce the error rate in an isolated spoken-word recognition task both with clean speech and with speech contaminated by additive noise. Furthermore, we achieved significant improvements when combining the orthogonalized DPFs with conventional static MFCCs and ΔP.
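The Gram-Schmidt step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `gram_schmidt`, the tolerance `eps`, and the toy three-dimensional vectors are assumptions made purely for illustration, and the variant shown is the modified Gram-Schmidt procedure, which subtracts each projection from the running remainder.

```python
import math


def gram_schmidt(vectors, eps=1e-12):
    """Return an orthonormal basis for the span of `vectors`.

    Hypothetical helper for illustration only; uses the modified
    Gram-Schmidt update (projections are taken on the running remainder).
    """
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            # Remove the component of w that lies along basis vector b.
            proj = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        # Skip vectors that are (numerically) linearly dependent on the basis.
        if norm > eps:
            basis.append([wi / norm for wi in w])
    return basis


# Toy input vectors (assumed values, not real DPF vectors).
toy_vectors = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
orthonormal = gram_schmidt(toy_vectors)
```

After the call, every pair of distinct vectors in `orthonormal` has a dot product of approximately zero, and each vector has unit length.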
A Study on Acoustic Modeling for Speech Recognition of Predominantly Monosyllabic Languages
Ekkarit MANEENOI Visarut AHKUPUTRA Sudaporn LUKSANEEYANAWIN Somchai JITAPUNKUL

PAPER

Vol: E87-D No: 5
Page(s): 1146-1163
This paper presents a study on acoustic modeling for speech recognition of predominantly monosyllabic languages. Various speech units used in speech recognition systems have been investigated. To evaluate the effectiveness of these acoustic models, the Thai language is selected, since it is a predominantly monosyllabic language and has a complex vowel system. Several experiments have been carried out to find the speech unit that yields an accurate acoustic model and a higher recognition rate. Recognition rates under different acoustic models are given and compared. In addition, this paper proposes a new speech unit for speech recognition, namely the onset-rhyme unit. Two models are proposed: the Phonotactic Onset-Rhyme Model (PORM) and the Contextual Onset-Rhyme Model (CORM). The models comprise a pair of onset and rhyme units, which together make up a syllable. An onset comprises an initial consonant and its transition towards the following vowel. The accompanying rhyme consists of a steady vowel segment and a final consonant. Experimental results show that the onset-rhyme model improves on the efficiency of other speech units. The onset-rhyme model improves on the accuracy of the inter-syllable triphone model by nearly 9.3% and of the context-dependent Initial-Final model by nearly 4.7% for the speaker-dependent systems using only an acoustic model, and by 5.6% and 4.5%, respectively, for the speaker-dependent systems using both acoustic and language models. The results show that the onset-rhyme models attain a high recognition rate. Moreover, they are also more efficient in terms of system complexity.
One-Pass Semi-Dynamic Network Decoding Using a Subnetwork Caching Model for Large Vocabulary Continuous Speech Recognition
Dong-Hoon AHN Minhwa CHUNG

PAPER

Vol: E87-D No: 5
Page(s): 1164-1174
This paper presents a new decoding framework for large vocabulary continuous speech recognition that can handle a static search network dynamically. Generally, a static network decoder can use a search space that is globally optimized in advance, and therefore it can run at high speed during decoding. However, its large memory requirement due to the large network size or the spatial complexity of the optimization algorithm often makes it impractical. Our new one-pass semi-dynamic network decoding scheme aims at incorporating such an optimized search network with memory efficiency, but without losing speed. In this framework, a complete search network is organized on the basis of self-structuring subnetworks and is nearly minimized using a modified tail-sharing algorithm. While the decoder runs, it caches subnetworks needed for decoding in memory, whereas static network decoders keep the complete network in memory. The subnetwork caching model is controlled by two levels of caches: local cache obtained by subnetwork caching operations and global cache obtained by subnetwork preloading operations. The model can also be controlled adaptively by using subnetwork profiling operations. Furthermore, it is made simple and fast with compactly designed self-structuring subnetworks. Experimental results on a 25 k-word Korean broadcast news transcription task show that the semi-dynamic decoder can run almost as fast as an equivalent static network decoder under various memory configurations by using the subnetwork caching model.
Sounds of Speech Based Spoken Document Categorization: A Subword Representation Method
Weidong QU Katsuhiko SHIRAI

PAPER

Vol: E87-D No: 5
Page(s): 1175-1184
In this paper, we explore an approach to the problem of spoken document categorization, which is the task of automatically assigning spoken documents to a set of predetermined categories. To categorize spoken documents, subword unit representations are used as an alternative to word units generated by either keyword spotting or large vocabulary continuous speech recognition (LVCSR). An advantage of using subword acoustic unit representations for spoken document categorization is that they do not require prior knowledge about the contents of the spoken documents and address the out-of-vocabulary (OOV) problem. Moreover, this method relies on the sounds of speech rather than exact orthography. The use of subword units instead of words allows approximate matching on inaccurate transcriptions, making "sounds-like" spoken document categorization possible. We also explore the performance of our method when the training set contains both perfect and errorful phonetic transcriptions, in the hope that the classifiers can learn from the confusion characteristics of the recognizer and the pronunciation variants of words to improve the robustness of the whole system. Our experiments based on both artificial and real corrupted data sets show that the proposed method is more effective and robust than the word-based method.
Pattern-Based Features vs. Statistical-Based Features in Decision Trees for Word Segmentation
Thanaruk THEERAMUNKONG Thanasan TANHERMHONG

PAPER-Natural Language Processing

Vol: E87-D No: 5
Page(s): 1254-1260
This paper proposes two alternative approaches that do not make use of a dictionary but instead utilize different types of learned features to segment words in a language that has no explicit word boundaries. Both methods utilize decision trees as the knowledge representation acquired from a training corpus in the segmentation process. The first method, a language-dependent technique, applies a set of constructed feature patterns based on character types to generate a set of heuristic segmentation rules. It separates a running text into a sequence of small chunks based on the given patterns, and constructs a decision tree for word segmentation. The second method extracts statistics of character sequences from a training corpus and uses them as features for constructing a set of rules by decision tree induction. The latter needs no linguistic knowledge. In experiments on the Thai language, both methods achieve relatively high accuracy, but the latter performs much better.
Directions in Polynomial Reconstruction Based Cryptography
Aggelos KIAYIAS Moti YUNG

INVITED PAPER

Vol: E87-A No: 5
Page(s): 978-985
Cryptography and Coding Theory are closely related in many respects. Recently, the problem of "decoding Reed Solomon codes" (also known as "polynomial reconstruction") was suggested as an intractability assumption to base the security of protocols on. This has initiated a line of cryptographic research exploiting the rich algebraic structure of the problem and its variants. In this paper we give a short overview of the recent works in this area as well as list directions and open problems in Polynomial Reconstruction Based Cryptography.
Noise Robust Speech Recognition Using F₀ Contour Information
Koji IWANO Takahiro SEKI Sadaoki FURUI

PAPER

Vol: E87-D No: 5
Page(s): 1102-1109
This paper proposes a noise robust speech recognition method using prosodic information. In Japanese, the fundamental frequency (F0) contour represents phrase intonation and word accent information. Consequently, it conveys information about prosodic phrases and word boundaries. This paper first describes a noise robust F0 extraction method using the Hough transform, which achieves high extraction rates under various noise environments. Then it proposes a robust speech recognition method using multi-stream HMMs which model both segmental spectral and F0 contour information. Speaker-independent experiments are conducted using connected digits uttered by 11 male speakers in various kinds of noise and SNR conditions. The recognition error rate is reduced in all noise conditions, and the best absolute improvement of digit accuracy is about 4.5%. This improvement is achieved by robust digit boundary detection using the prosodic information.
What are the Essential Cues for Understanding Spoken Language?
Steven GREENBERG Takayuki ARAI

INVITED PAPER

Vol: E87-D No: 5
Page(s): 1059-1070
Classical models of speech recognition assume that a detailed, short-term analysis of the acoustic signal is essential for accurately decoding the speech signal and that this decoding process is rooted in the phonetic segment. This paper presents an alternative view, one in which the time scales required to accurately describe and model spoken language are both shorter and longer than the phonetic segment, and are inherently wedded to the syllable. The syllable reflects a singular property of the acoustic signal -- the modulation spectrum -- which provides a principled, quantitative framework to describe the process by which the listener proceeds from sound to meaning. The ability to understand spoken language (i.e., intelligibility) vitally depends on the integrity of the modulation spectrum within the core range of the syllable (3-10 Hz) and reflects the variation in syllable emphasis associated with the concept of prosodic prominence ("accent"). A model of spoken language is described in which the prosodic properties of the speech signal are embedded in the temporal dynamics associated with the syllable, a unit serving as the organizational interface among the various tiers of linguistic representation.
Complex Dielectric Image Green's Function via Pade Approximation for On-Chip Interconnects
Wenliang DAI Zhengfan LI Fuhua LI

PAPER-Microwaves, Millimeter-Waves

Vol: E87-C No: 5
Page(s): 772-777
The complex dielectric image Green's function for metal-insulator-semiconductor (MIS) technology is proposed in this paper through the dielectric image method. The Epsilon algorithm for Pade approximation is then used to accelerate the convergence of the infinite series summation resulting from the complex dielectric image Green's function. Because the dielectric permittivity of the semiconducting substrate is complex, the real and imaginary parts of the resulting Green's function are accelerated separately by the Epsilon algorithm. Combined with the complex dielectric image Green's function, the frequency-dependent capacitance and conductance of the transmission lines and interconnects based on MIS technology are investigated through the method of moments (MoM). The computational results of our method for 2-D and 3-D extraction examples agree well with experimental data obtained from chip measurement and with other methods such as full-wave analysis and FastCap.
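The series acceleration mentioned in the abstract above can be sketched with Wynn's epsilon algorithm, which builds Pade-type estimates of a series limit from its partial sums. This is only a generic illustration, not the paper's code: the function name `wynn_epsilon` and the test series (the alternating harmonic series for ln 2, standing in for a Green's function series) are assumptions, and a complex-valued series would be handled by applying the same routine to its real and imaginary parts separately, as the abstract describes.

```python
import math


def wynn_epsilon(partial_sums):
    """Estimate the limit of a sequence of partial sums.

    Hypothetical helper using the recurrence
    eps[k+1][j] = eps[k-1][j+1] + 1 / (eps[k][j+1] - eps[k][j]);
    the even-numbered columns hold the accelerated estimates.
    """
    n = len(partial_sums)
    prev = [0.0] * (n + 1)      # column eps[-1], all zeros
    curr = list(partial_sums)   # column eps[0], the partial sums
    best = curr[-1]
    for k in range(1, n):
        nxt = []
        for j in range(len(curr) - 1):
            diff = curr[j + 1] - curr[j]
            if diff == 0.0:
                # The sequence has converged to machine precision.
                return best
            nxt.append(prev[j + 1] + 1.0 / diff)
        prev, curr = curr, nxt
        if k % 2 == 0:
            best = curr[-1]
    return best


# Partial sums of 1 - 1/2 + 1/3 - ... (assumed test series, limit ln 2).
sums = []
total = 0.0
for k in range(1, 12):
    total += (-1.0) ** (k + 1) / k
    sums.append(total)

estimate = wynn_epsilon(sums)
raw_error = abs(sums[-1] - math.log(2.0))
accelerated_error = abs(estimate - math.log(2.0))
```

With only eleven terms, `accelerated_error` is several orders of magnitude smaller than `raw_error`, which is the kind of saving that motivates accelerating a slowly converging image series.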
New Three-Level Boolean Expression Based on EXOR Gates
Ryoji ISHIKAWA Takashi HIRAYAMA Goro KODA Kensuke SHIMIZU

PAPER-Computer Components

Vol: E87-D No: 5
Page(s): 1214-1222
The utilization of EXOR gates often decreases the number of gates needed for realizing practical logical networks, and enhances the testability of networks. Therefore, logic synthesis with EXOR gates has been studied. In this paper we propose a new logic representation: an ESPP (EXOR-Sum-of-Pseudoproducts) form based on pseudoproducts. This form provides a new three-level network with EXOR gates. Some functional classes in ESPP forms can be realized with shorter expressions than in conventional forms such as the Sum-of-Products. Since many practical functions have the properties of such classes, the ESPP form is useful for making a compact form. We propose a heuristic minimization algorithm for ESPP, and we demonstrate the compactness of ESPPs by showing our experimental results. We apply our technique to some logic function classes and MCNC benchmark networks. The experimental results show that most ESPP forms have fewer literals than conventional forms.
Improved HMM Separation for Distant-Talking Speech Recognition
Tetsuya TAKIGUCHI Masafumi NISHIMURA

PAPER

Vol: E87-D No: 5
Page(s): 1127-1137
In distant-talking speech recognition, the recognition accuracy is seriously degraded by reverberation and environmental noise. A robust speech recognition technique for such environments, HMM separation and composition, has been described previously. HMM separation estimates the model parameters of the acoustic transfer function using adaptation data uttered from an unknown position in noisy and reverberant environments, and HMM composition builds an HMM of noisy and reverberant speech, using the acoustic transfer function estimated by HMM separation. Previously, HMM separation has been applied to the acoustic transfer function based on a single Gaussian distribution. However, the improvement was smaller than expected for impulse responses with long reverberations. This is because the variance of the acoustic transfer function in each frame increases, since the impulse response of the room reverberation is longer than the spectral analysis window. In this paper, HMM separation is extended to estimate the acoustic transfer function based on Gaussian mixture components in order to compensate for the greater variability of the acoustic transfer function, and the re-estimation formulae are derived. In addition, this paper introduces a technique to adapt the noise weight for each mel-spaced frequency in order to improve the performance of HMM separation in the linear-spectral domain, since the use of HMM separation in the linear-spectral domain sometimes causes a negative mean output due to the subtraction operation. The extended HMM separation is evaluated on distant-talking speech recognition tasks. The experimental results demonstrate the effectiveness of the proposed method.
