Yusuke IJIMA Takashi NOSE Makoto TACHIBANA Takao KOBAYASHI
In this paper, we propose a rapid model adaptation technique for emotional speech recognition which enables us to extract paralinguistic information as well as linguistic information contained in speech signals. This technique is based on style estimation and style adaptation using a multiple-regression HMM (MRHMM). In the MRHMM, the mean parameters of the output probability density function are controlled by a low-dimensional parameter vector, called a style vector, which corresponds to a set of the explanatory variables of the multiple regression. The recognition process consists of two stages. In the first stage, the style vector that represents the emotional expression category and the intensity of its expressiveness for the input speech is estimated on a sentence-by-sentence basis. Next, the acoustic models are adapted using the estimated style vector, and then standard HMM-based speech recognition is performed in the second stage. We assess the performance of the proposed technique in the recognition of simulated emotional speech uttered by both professional narrators and non-professional speakers.
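The dependence of the output means on the style vector can be illustrated with a minimal sketch. It assumes the usual linear multiple-regression form, in which the mean is a regression matrix applied to the style vector augmented with a bias term; the matrix `H`, the dimensions, and the function name are hypothetical and are not taken from the paper.

```python
def mrhmm_mean(H, style_vector):
    """Mean of one output pdf as H @ [1, v] (assumed linear regression form)."""
    xi = [1.0] + [float(v) for v in style_vector]  # bias term + style vector
    return [sum(h * x for h, x in zip(row, xi)) for row in H]

# Hypothetical regression matrix: 3-dimensional features, 2-dimensional style vector.
H = [
    [1.0, 0.5, 0.0],
    [0.0, 0.0, 2.0],
    [2.0, -1.0, 1.0],
]

neutral = mrhmm_mean(H, [0.0, 0.0])  # only the bias column contributes
styled = mrhmm_mean(H, [1.0, 0.5])   # mean shifted by the estimated style vector
```

In the two-stage process described above, the style vector would first be estimated per sentence and then passed to a function of this kind to adapt every mean before decoding.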
Lei WANG Baoyu ZHENG Qingmin MENG Chao CHEN
Based on Free Probability Theory (FPT), which has become an important branch of Random Matrix Theory (RMT), a new frequency band sensing scheme for Cognitive Radio (CR) in Direct-Sequence Code-Division Multiple-Access (DS-CDMA) multiuser networks is proposed. Unlike previous studies in the field, the new scheme does not require knowledge of the users' spreading sequences and is related to the asymptotically free behavior of random matrices. Simulation results show that the asymptotic claims hold true even for a small number of observations (which makes the scheme convenient for time-varying topologies), and that it outperforms the classical energy detection scheme and another scheme based on random matrix theory.
Bong-Jin LEE Chi-Sang JUNG Jeung-Yoon CHOI Hong-Goo KANG
This letter describes the importance of transition regions, e.g. at phoneme boundaries, for automatic speaker recognition compared with steady-state regions. Experimental results of automatic speaker identification tasks confirm that transition regions include the most speaker-distinctive features. A possible reason for these results is described in terms of articulation, in particular the degree of freedom of the articulators. These results are expected to provide useful information for designing an efficient automatic speaker recognition system.
Wimol SAN-UM Masayoshi TACHIBANA
An analog circuit testing scheme is presented. The testing technique is a sinusoidal fault signature characterization, involving the measurement of DC offset, amplitude, frequency and phase shift, and the realization of two crossing level voltages. The testing system is an extension of the IEEE 1149.4 standard through the modification of an analog boundary module, affording both on-chip testing capability and access to internal components for off-chip testing. A demonstration circuit-under-test, a 4th-order Gm-C low-pass filter, and the proposed analog testing scheme are implemented at the physical level using 0.18-µm CMOS technology, and simulated using Hspice. Both catastrophic and parametric faults are potentially detectable at a minimum parameter variation of 0.5%. The fault coverage associated with CMOS transconductance operational amplifiers and capacitors is 94.16% and 100%, respectively. This work enhances a standardized test approach, which reduces the complexity of the testing circuit and provides non-intrusive analog circuit testing.
Arturo Arvizu MONDRAGON Juan-de-Dios Sachez LOPEZ Francisco-Javier Mendieta JIMENEZ
We present a BPSK coherent optical wireless link in a multiple-beam, multiple-aperture configuration. The data are recovered using the signal obtained by the coherent addition of a set of maximum likelihood optical phase estimates and a select-largest stage. The proposal offers higher performance than the combining methods commonly used in optical wireless systems with diversity transmission and coherent detection.
Miki SATO Toru IWASAWA Akihiko SUGIYAMA Toshihiro NISHIZAWA Yosuke TAKANO
This paper presents a single-chip speech dialogue module and its evaluation on a personal robot. This module is implemented on an application processor that was developed primarily for mobile phones to provide compact size, low power consumption, and low cost. It performs speech recognition with preprocessing functions such as direction-of-arrival (DOA) estimation, noise cancellation, beamforming with an array of microphones, and echo cancellation. Text-to-speech (TTS) conversion is also provided. Evaluation results obtained on a new personal robot, PaPeRo-mini, which is a scaled-down version of PaPeRo, demonstrate an 85% correct rate in DOA estimation, and as much as 54% and 30% higher speech recognition rates in noisy environments and during robot utterances, respectively. These results are shown to be comparable to those obtained by PaPeRo.
Wen-An TSOU Wen-Shen WUEN Kuei-Ann WEN
A circuit technique to correct Vdd/PM distortion and improve efficiency in supply-modulated cascode class-E PAs is proposed. The experimental results show that the phase distortion is reduced from 20 degrees to 5 degrees. Moreover, a system co-simulation demonstrates that the EVM improves from -17 dB to -19 dB.
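To put the EVM figures in perspective, the following sketch converts EVM from decibels to percent. It assumes the common convention that EVM in dB equals 20 log10 of the RMS EVM ratio; the function name is illustrative.

```python
def evm_db_to_percent(evm_db):
    """Convert EVM in dB to percent, assuming EVM_dB = 20*log10(EVM_rms)."""
    return 100.0 * 10.0 ** (evm_db / 20.0)

before = evm_db_to_percent(-17.0)  # EVM without the correction
after = evm_db_to_percent(-19.0)   # EVM with the correction
print(f"{before:.1f}% -> {after:.1f}%")
```

Under this convention, the 2 dB improvement corresponds to the RMS error dropping from roughly 14% to roughly 11%.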
This paper studies scattering and diffraction of a TE plane wave from a periodic surface with semi-infinite extent. By combining the Wiener-Hopf technique with a perturbation method, a concrete representation of the wavefield is explicitly obtained as a sum of two types of Fourier integrals. It is then found that the effects of surface roughness appear mainly on the illuminated side and only weakly on the shadow side. Moreover, ripples in the angular distribution of the first-order scattering on the shadow side are newly found, arising from interference between a cylindrical wave radiated from the edge and an inhomogeneous plane wave supported by the periodic surface.
Mutsumi KOMURO Norihisa KOMODA
Through an analysis of the Rayleigh model, an explanatory model for the quality effect of peer reviews is constructed. The review activities are evaluated by the defect removal rate at each phase. We made hypotheses on how these measurements are related to the product quality. These hypotheses are verified through regression analysis of actual project data, and concrete calculation formulae are obtained as a model. Using the mechanism by which this model is constructed, we can develop a method for making a concrete review plan and setting objective values to manage ongoing review activities.
Hiroshi TOKITO Masahiro SASABE Go HASEGAWA Hirotaka NAKANO
Wireless mesh networks have been attracting many users in recent years. By connecting base stations (mesh nodes) with wireless connections, these networks can achieve a wide-area wireless environment with flexible configuration and low cost, at the risk of radio interference between wireless links. When we utilize wireless mesh networks as infrastructure for Internet access, all network traffic from mobile nodes goes through a gateway node that is directly connected to the wired network. Therefore, it is necessary to distribute the traffic load by deploying multiple gateway nodes. In this paper, we propose a spanning tree construction algorithm for TDMA-based wireless mesh networks with multiple gateway nodes so as to maximize the traffic volume transferred between the mesh network and the Internet (system throughput) by taking account of the traffic load on the gateway nodes, the access link capacity and radio interference. Through a performance evaluation, we show that the proposed algorithm increases the system throughput regardless of the bottleneck position and achieves up to 3.1 times higher system throughput than a conventional algorithm.
Osamu SHIMADA Akihiko SUGIYAMA Toshiyuki NOMURA
This paper proposes a low-complexity noise suppressor with hybrid filterbanks and adaptive time-frequency tiling. An analysis hybrid filterbank provides efficient transformation by further decomposing low-frequency bins after a coarse transformation with a short frame size. A synthesis hybrid filterbank also reduces computational complexity in a similar fashion to the analysis hybrid filterbank. Adaptive time-frequency tiling reduces the number of spectral gain calculations. It adaptively generates tiling information in the time-frequency plane based on the signal characteristics. The average number of instructions on a typical DSP chip has been reduced by 30% to 7.5 MIPS in the case of mono signals sampled at 44.1 kHz. A subjective test result shows that the sound quality of the proposed method is comparable to that of the conventional one.
B. A. Hirantha Sithira ABEYSEKERA Takahiro MATSUDA Tetsuya TAKINE
In the IEEE 802.11 MAC protocol, access points (APs) are given the same priority as wireless terminals in terms of acquiring the wireless link, even though they aggregate several downlink flows. This feature leads to a serious throughput degradation of downlink flows, compared with uplink flows. In this paper, we propose a dynamic contention window control scheme for IEEE 802.11e EDCA-based wireless LANs, in order to achieve fairness between uplink and downlink TCP flows while guaranteeing QoS requirements for real-time traffic. The proposed scheme first determines the minimum contention window size in the best-effort access category at APs, based on the number of TCP flows. It then determines the minimum and maximum contention window sizes in higher priority access categories, such as voice and video, so as to guarantee QoS requirements for such real-time traffic. Note that the proposed scheme does not require any modification to the MAC protocol at wireless terminals. Through simulation experiments, we show the effectiveness of the proposed scheme.
In this study, discriminative weight training is applied to support vector machine (SVM) based speech/music classification for a 3GPP2 selectable mode vocoder (SMV). In the proposed approach, the speech/music decision rule is derived by the SVM by incorporating optimally weighted features derived from the SMV based on a minimum classification error (MCE) method. This method differs from that of the previous work in that different weights are assigned to each feature of the SMV, which is a novel process. According to the experimental results, the proposed approach is effective for speech/music classification using the SVM.
Hiroshi HIRAYAMA Nobuyoshi KIKUMA Kunio SAKAKIBARA
A new technique to estimate the Poynting vector distribution from near-magnetic-field measurement is proposed. To calculate the Poynting vector, both the electric and magnetic fields must be known. In the proposed method, only magnetic-field measurement along three orthogonal axes is required. The electric field is estimated from the measured magnetic field by using Maxwell's equations. A modified Yee cell is employed for this estimation. Finally, the Poynting vector is calculated from the measured magnetic field and the estimated electric field. Since the proposed method reveals the propagation direction of electromagnetic energy, it can be utilized to locate an emission source and to investigate the mechanism of undesired emission. Experiments are carried out to discuss the accuracy and to validate practical usefulness.
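The chain from magnetic field to Poynting vector can be sketched as follows. This is a simplified illustration, not the paper's modified Yee cell: it assumes a source-free region in free space, phasor fields, and plain central differences for the curl, so that E = (curl H) / (jωε0) and the time-averaged Poynting vector is S = Re(E × H*) / 2. The function names and the plane-wave test field are hypothetical.

```python
import cmath
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity [F/m]
MU0 = 4e-7 * math.pi     # vacuum permeability [H/m]

def curl(field, p, d):
    """Central-difference curl of a complex vector field at point p (step d)."""
    def partial(comp, axis):
        hi, lo = list(p), list(p)
        hi[axis] += d
        lo[axis] -= d
        return (field(hi)[comp] - field(lo)[comp]) / (2 * d)
    return (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))

def poynting_from_h(h_field, p, omega, d):
    """Time-averaged Poynting vector estimated from the magnetic field only."""
    e = [c / (1j * omega * EPS0) for c in curl(h_field, p, d)]  # Ampere's law
    h = [complex(x).conjugate() for x in h_field(p)]
    cross = (e[1] * h[2] - e[2] * h[1],
             e[2] * h[0] - e[0] * h[2],
             e[0] * h[1] - e[1] * h[0])
    return tuple(0.5 * c.real for c in cross)

# Hypothetical test field: 1 GHz plane wave travelling along +z, H along y, 1 A/m.
omega = 2 * math.pi * 1e9
k = omega * math.sqrt(MU0 * EPS0)

def plane_wave_h(p):
    return (0j, cmath.exp(-1j * k * p[2]), 0j)

S = poynting_from_h(plane_wave_h, (0.0, 0.0, 0.0), omega, 1e-3)
```

For this test field the estimate points along +z with a magnitude close to half the free-space wave impedance, which is the expected power density for a 1 A/m plane wave.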
Changchun XU Yanyi XU Gan LIU Kezhong LIU
Supporting quality-of-service (QoS) for multimedia communications over IEEE 802.11 based ad hoc networks is a challenging task. This paper develops a simple 3-D Markov chain model for queuing analysis of the IEEE 802.11 MAC layer. The model is applied to performance analysis of voice communications over IEEE 802.11 single-hop ad hoc networks. Using the model, we optimize the performance of the IEEE 802.11 MAC layer and obtain the maximum number of voice calls in IEEE 802.11 ad hoc networks as well as the statistical performance bounds. Furthermore, we design a fully distributed call admission control (CAC) algorithm which can provide a strict statistical QoS guarantee for voice communications over IEEE 802.11 ad hoc networks. Extensive simulations confirm the accuracy of the analytical model and the effectiveness of the CAC scheme.
The architecture of ZigBee networks focuses on developing low-cost, low-speed ubiquitous communication between devices. The ZigBee technique is based on IEEE 802.15.4, which specifies the physical layer and medium access control (MAC) for a low-rate wireless personal area network (LR-WPAN). Currently, numerous wireless sensor networks have adopted the ZigBee open standard to develop various services that improve communication quality in our daily lives. System and network reliability has become more important for providing stable services, because these services stop if the system or network is unreliable. The ZigBee standard has three kinds of networks: star, tree and mesh. The paper models the ZigBee protocol stack from the physical layer to the application layer and analyzes the reliability and mean time to failure (MTTF) of these layers. Channel resource usage, device role, network topology and application objects are used to evaluate reliability in the physical, medium access control, network, and application layers, respectively. In star or tree networks, a series system and the reliability block diagram (RBD) technique can be used to solve the reliability problem. For mesh networks, however, a division technique is applied because their complexity is higher than that of the other two topologies. A mesh network is divided by this technique into several non-reducible series systems and edge parallel systems. Hence, the reliability of mesh networks is easily solved using series-parallel systems through our proposed scheme. The numerical results demonstrate that the reliability of mesh networks increases when the number of edges in parallel systems increases, while the reliability of all three networks drops quickly when the number of edges and the number of nodes increase. Greater resource usage is another factor that decreases reliability. In general, network reliability decreases with higher network complexity, greater resource usage and more complex object relationships.
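The series-parallel reasoning above can be sketched with the textbook reliability block diagram formulas: a series system works only if every block works, and a parallel system fails only if every block fails. The component reliabilities and failure rates below are hypothetical, and the exponential-lifetime MTTF formula is an assumption rather than the paper's model.

```python
import math

def series(reliabilities):
    """Series system: all blocks must work."""
    return math.prod(reliabilities)

def parallel(reliabilities):
    """Parallel system: fails only if all blocks fail."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)

def mttf_series_exponential(failure_rates):
    """MTTF of a series system, assuming independent exponential lifetimes."""
    return 1.0 / sum(failure_rates)

# Hypothetical values: two nodes (0.95 each) joined by redundant edges (0.90 each).
node, edge = 0.95, 0.90
two_edges = series([node, parallel([edge, edge]), node])
three_edges = series([node, parallel([edge, edge, edge]), node])
extra_node = series([node, parallel([edge, edge]), node, node])
```

In this toy case, adding a parallel edge raises the path reliability while adding a node in series lowers it, which matches the trend reported in the numerical results.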
Mohammad SOLEIMANI Abdollah KHOEI Khayrollah HADIDI Vahid Fagih DINAVARI
In this paper, a new structure for a voltage-mode MAX-MIN circuit is presented for nonlinear systems, fuzzy applications, neural networks, etc. A differential pair with an improved cascode current mirror is used to choose the desired input. The advantages of the proposed structure are high operating frequency, high precision, low power consumption, small area and simple expansion to multiple inputs by adding only three transistors for each extra input. The proposed circuit, simulated with HSPICE in a 0.35 µm CMOS process, shows a total power consumption of 85 µW at a 5 MHz operating frequency from a single 3.3-V supply. Also, the total area of the proposed circuit is about 420 µm² for two input voltages, and increases only negligibly for each extra input.
Go HASEGAWA Yuichiro HIRAOKA Masayuki MURATA
Recent research on overlay networks has revealed that user-perceived network performance could be improved by an overlay routing mechanism. The effectiveness of overlay routing is mainly a result of the policy mismatch between the overlay routing and the underlay IP routing operated by ISPs. However, this policy mismatch causes a "free-riding" traffic problem, which may become harmful to the cost structure of Internet Service Providers. In the present paper, we define the free-riding problem in overlay routing and evaluate the degree of free-riding traffic to reveal the effect of the problem on ISPs. We introduce a numerical metric to evaluate the degree of the free-riding problem and confirm that most multihop overlay paths that have better performance than the direct path cause the free-riding problem. We also discuss guidelines for selecting paths that are more effective than the direct path and that mitigate the free-riding problem.
Abdellah KADDAI Mohammed HALIMI
In this paper, an algebraic trellis vector quantization (ATVQ) that introduces algebraic codebooks into the trellis coded vector quantization (TCVQ) structure is presented. Low encoding complexity and minimal memory storage requirements are achieved using the proposed approach. It exploits the advantages of both the TCVQ and the algebraic codebooks, namely delayed decision, codebook widening, low computational complexity and no codebook storage. This novel vector quantization scheme is used to encode the line spectral frequency (LSF) parameters of wideband speech. Experimental results on wideband speech have shown that ATVQ yields the same performance as the traditional split vector quantization (SVQ) and the TCVQ in terms of spectral distortion (SD). It can achieve transparent quality at 47 bits/frame with a considerable reduction of memory storage and computational complexity when compared to SVQ and TCVQ.
Kai HUANG Zhikuang CAI Xin CHEN Longxing SHI
This paper proposes a novel delay-locked loop (DLL) with a fast-locking property. The improved fast-locking successive approximation register-controlled (IFSAR) scheme reduces the locking time to n+4 periods and is harmonic-free, where n is the number of bits in the control code of the delay line. According to simulation results in 180 nm CMOS technology, the DLL covers an operating range from 70 MHz to 500 MHz and dissipates 10.44 mW at 500 MHz.
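The reason a successive-approximation search needs on the order of n steps for an n-bit control code can be shown with a generic sketch. This is a plain SAR binary search, not the IFSAR scheme itself; the linear delay-line model, the 50 ps step, and n = 8 are hypothetical, and only the n+4 lock-time figure comes from the abstract.

```python
def sar_search(n_bits, delay_of_code, target_delay):
    """Generic SAR search: decide one control bit per step, MSB first."""
    code = 0
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)
        if delay_of_code(trial) <= target_delay:  # keep the bit if not too slow
            code = trial
    return code

def lock_time_ns(n_bits, clock_mhz):
    """Lock time for the n+4 periods stated in the abstract."""
    period_ns = 1000.0 / clock_mhz
    return (n_bits + 4) * period_ns

# Hypothetical delay line: 50 ps per code step, target of one 500 MHz period.
STEP_PS = 50
code = sar_search(8, lambda c: c * STEP_PS, 2000)
lock_ns = lock_time_ns(8, 500.0)
```

With these assumed numbers the search settles on the code whose delay equals one clock period after eight decisions, and an 8-bit control code at 500 MHz would lock in 12 periods.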