Keyword Search Result

[Keyword] SPE(2504hit)

961-980hit(2504hit)

  • A Single-Chip Speech Dialogue Module and Its Evaluation on a Personal Robot, PaPeRo-Mini

    Miki SATO  Toru IWASAWA  Akihiko SUGIYAMA  Toshihiro NISHIZAWA  Yosuke TAKANO  

     
    PAPER-Digital Signal Processing

    Vol: E93-A  No: 1  Page(s): 261-271

    This paper presents a single-chip speech dialogue module and its evaluation on a personal robot. The module is implemented on an application processor that was developed primarily for mobile phones, which provides compact size, low power consumption, and low cost. It performs speech recognition with preprocessing functions such as direction-of-arrival (DOA) estimation, noise cancellation, beamforming with an array of microphones, and echo cancellation. Text-to-speech (TTS) conversion is also provided. Evaluation results obtained on a new personal robot, PaPeRo-mini, which is a scaled-down version of PaPeRo, demonstrate an 85% correct rate in DOA estimation, and as much as 54% and 30% higher speech recognition rates in noisy environments and during robot utterances, respectively. These results are shown to be comparable to those obtained by PaPeRo.

  • On the Importance of Transition Regions for Automatic Speaker Recognition

    Bong-Jin LEE  Chi-Sang JUNG  Jeung-Yoon CHOI  Hong-Goo KANG  

     
    LETTER-Speech and Hearing

    Vol: E93-D  No: 1  Page(s): 197-200

    This letter describes the importance of transition regions, e.g. at phoneme boundaries, for automatic speaker recognition compared with steady-state regions. Experimental results of automatic speaker identification tasks confirm that transition regions include the most speaker-distinctive features. A possible reason for these results is described in terms of articulation, in particular the degrees of freedom of the articulators. These results are expected to provide useful information for designing an efficient automatic speaker recognition system.

  • A Rapid Model Adaptation Technique for Emotional Speech Recognition with Style Estimation Based on Multiple-Regression HMM

    Yusuke IJIMA  Takashi NOSE  Makoto TACHIBANA  Takao KOBAYASHI  

     
    PAPER-Speech and Hearing

    Vol: E93-D  No: 1  Page(s): 107-115

    In this paper, we propose a rapid model adaptation technique for emotional speech recognition which enables us to extract paralinguistic information as well as linguistic information contained in speech signals. This technique is based on style estimation and style adaptation using a multiple-regression HMM (MRHMM). In the MRHMM, the mean parameters of the output probability density function are controlled by a low-dimensional parameter vector, called a style vector, which corresponds to a set of the explanatory variables of the multiple regression. The recognition process consists of two stages. In the first stage, the style vector that represents the emotional expression category and the intensity of its expressiveness for the input speech is estimated on a sentence-by-sentence basis. Next, the acoustic models are adapted using the estimated style vector, and then standard HMM-based speech recognition is performed in the second stage. We assess the performance of the proposed technique in the recognition of simulated emotional speech uttered by both professional narrators and non-professional speakers.
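
The role of the style vector in the abstract above can be illustrated with a minimal sketch. It assumes the usual affine form of multiple regression, where a state mean is a regression matrix applied to a bias term followed by the style vector. The names `mrhmm_mean` and `H` and all numeric values are hypothetical and are not taken from the paper.

```python
def mrhmm_mean(regression_matrix, style_vector):
    """Return the state mean mu = H @ [1, v1, ..., vL] for one HMM state.

    Assumption: an affine multiple-regression form with a leading bias term.
    """
    xi = [1.0] + list(style_vector)  # bias term followed by the style vector
    if any(len(row) != len(xi) for row in regression_matrix):
        raise ValueError("each row of H needs len(style_vector) + 1 columns")
    return [sum(h * x for h, x in zip(row, xi)) for row in regression_matrix]


# Hypothetical 2-dimensional feature mean driven by a 1-dimensional style vector.
H = [[1.0, 0.5],
     [2.0, -1.0]]
neutral = mrhmm_mean(H, [0.0])    # style intensity 0
emotional = mrhmm_mean(H, [1.0])  # style intensity 1
print(neutral, emotional)
```

Under this assumption, adapting the acoustic model to an estimated style amounts to recomputing every state mean from the same low-dimensional vector, which is why a sentence-level estimate can be enough.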

  • A Low Complexity Noise Suppressor with Hybrid Filterbanks and Adaptive Time-Frequency Tiling

    Osamu SHIMADA  Akihiko SUGIYAMA  Toshiyuki NOMURA  

     
    PAPER-Digital Signal Processing

    Vol: E93-A  No: 1  Page(s): 254-260

    This paper proposes a low-complexity noise suppressor with hybrid filterbanks and adaptive time-frequency tiling. An analysis hybrid filterbank provides efficient transformation by further decomposing low-frequency bins after a coarse transformation with a short frame size. A synthesis hybrid filterbank reduces computational complexity in a similar fashion. Adaptive time-frequency tiling reduces the number of spectral gain calculations. It adaptively generates tiling information in the time-frequency plane based on the signal characteristics. The average number of instructions on a typical DSP chip has been reduced by 30% to 7.5 MIPS in the case of mono signals sampled at 44.1 kHz. A subjective test shows that the sound quality of the proposed method is comparable to that of the conventional one.

  • A Technique for Estimating Intensity of Emotional Expressions and Speaking Styles in Speech Based on Multiple-Regression HSMM

    Takashi NOSE  Takao KOBAYASHI  

     
    PAPER-Speech and Hearing

    Vol: E93-D  No: 1  Page(s): 116-124

    In this paper, we propose a technique for estimating the degree or intensity of emotional expressions and speaking styles appearing in speech. The key idea is based on a style control technique for speech synthesis using a multiple-regression hidden semi-Markov model (MRHSMM), and the proposed technique can be viewed as the inverse of the style control. In the proposed technique, the acoustic features of spectrum, power, fundamental frequency, and duration are simultaneously modeled using the MRHSMM. Based on a maximum likelihood criterion, we derive an algorithm for estimating the explanatory variables of the MRHSMM, each of which represents the degree or intensity of emotional expressions and speaking styles appearing in the acoustic features of speech. We show experimental results to demonstrate the ability of the proposed technique using two types of speech data: simulated emotional speech and spontaneous speech with different speaking styles. The estimated values are found to correlate with human perception.

  • Performance Analysis in Cognitive Radio Systems with Multiple Antennas

    Peng WANG  Xiaofeng ZHONG  Limin XIAO  Shidong ZHOU  Jing WANG  Yong BAI  

     
    LETTER-Wireless Communication Technologies

    Vol: E93-B  No: 1  Page(s): 182-186

    In this letter, the performance improvement by the deployment of multiple antennas in cognitive radio systems is studied from a system-level view. The term opportunistic spectrum efficiency (OSE) is defined as the performance metric to evaluate the spectrum opportunities that can actually be exploited by the secondary user (SU). By applying a simple energy combining detector, we show that deploying multiple antennas at the SU transceiver can improve the maximum achievable OSE significantly. Numerical results also reveal that the improvement comes from the reduction of both the detection overhead and the false alarm probability.
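
As a rough illustration of an energy combining detector, the sketch below sums the received energy over all antennas and compares it with a threshold. The signal model (a constant-amplitude primary signal in unit-variance real Gaussian noise), the threshold value, and all function names are assumptions made for the example; the letter's actual detector and its OSE analysis are not reproduced here.

```python
import random


def combined_energy(samples_per_antenna):
    """Sum the received energy over all antennas and all samples."""
    return sum(abs(x) ** 2 for antenna in samples_per_antenna for x in antenna)


def detect_primary(samples_per_antenna, threshold):
    """Declare the channel busy when the combined energy exceeds the threshold."""
    return combined_energy(samples_per_antenna) > threshold


def simulate(num_antennas, num_samples, signal_amplitude, rng):
    """Hypothetical test data: constant primary signal plus unit-variance noise."""
    return [[signal_amplitude + rng.gauss(0.0, 1.0) for _ in range(num_samples)]
            for _ in range(num_antennas)]


rng = random.Random(0)
threshold = 1200.0  # assumed value, chosen between the idle and busy energy levels
idle = simulate(4, 200, 0.0, rng)   # noise only
busy = simulate(4, 200, 1.0, rng)   # primary user present
print(detect_primary(idle, threshold), detect_primary(busy, threshold))
```

With more antennas, the same number of samples per antenna yields a larger combined statistic, so a given detection quality can be reached with a shorter sensing period. This is one intuitive reading of the reduced detection overhead mentioned in the abstract.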

  • Discriminative Weight Training for Support Vector Machine-Based Speech/Music Classification in 3GPP2 SMV Codec

    Sang-Kyun KIM  Joon-Hyuk CHANG  

     
    LETTER-Speech and Hearing

    Vol: E93-A  No: 1  Page(s): 316-319

    In this study, discriminative weight training is applied to support vector machine (SVM) based speech/music classification for the 3GPP2 selectable mode vocoder (SMV). In the proposed approach, the speech/music decision rule is derived by the SVM by incorporating optimally weighted features derived from the SMV based on a minimum classification error (MCE) method. This method differs from previous work in that different weights are assigned to each feature of the SMV, which is a novel process. According to the experimental results, the proposed approach is effective for speech/music classification using the SVM.

  • A Low-PAPR Multiplexed MC-CDMA System with Enhanced Data Rate and Link Quality

    Juinn-Horng DENG  Jeng-Kuang HWANG  

     
    PAPER-Wireless Communication Technologies

    Vol: E93-B  No: 1  Page(s): 135-143

    Recently, a new multi-carrier CDMA (MC-CDMA) system with cyclic-shift orthogonal keying (CSOK) has been proposed and shown to be more spectrally efficient and power efficient than conventional MC-CDMA systems. In this paper, a novel extension called the multiplexed CSOK (MCSOK) MC-CDMA system is proposed to further increase the data rate while maintaining a low peak-to-average power ratio (PAPR). First, the data stream is divided into multiple parallel substreams that are mapped into QPSK-CSOK symbols in terms of cyclic-shifted Chu sequences. Second, these sequences are repeated, modulated, summed, and placed on IFFT subcarriers, resulting in a constant-modulus multiplexed signal that preserves the desired orthogonality among substreams. The receiver performs frequency-domain equalization and uses efficient demultiplexing, despreading, and demapping schemes to detect the modulation symbols. Furthermore, an alternate MCSOK system configuration with high link quality is also presented. Simulations show that the proposed MCSOK system attains lower PAPR and BER than a conventional MC-CDMA system using Walsh codes. In a rich multipath environment, the high-link-quality configuration exhibits excellent performance with both diversity gain and MCSOK modulation gain.
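
The two properties of Chu sequences that cyclic-shift keying relies on, constant modulus and orthogonality between cyclic shifts, can be checked with a short sketch. It covers only an even-length Chu sequence with root index 1 and a bare shift-keying map and correlator; the QPSK phase, multiplexing, IFFT placement, and equalization described in the abstract are omitted. The function names are hypothetical.

```python
import cmath


def chu_sequence(length):
    """Chu sequence of even length with root index 1: c[k] = exp(j*pi*k^2/N)."""
    if length % 2 != 0:
        raise ValueError("this sketch only covers even lengths")
    return [cmath.exp(1j * cmath.pi * k * k / length) for k in range(length)]


def cyclic_shift(seq, shift):
    """Rotate the sequence right by `shift` positions."""
    n = len(seq)
    return [seq[(k - shift) % n] for k in range(n)]


def csok_map(symbol, base):
    """Encode an integer symbol as the cyclic shift of the base sequence."""
    return cyclic_shift(base, symbol)


def csok_demap(received, base):
    """Correlate against every cyclic shift and return the best-matching shift."""
    n = len(base)
    scores = [abs(sum(r * c.conjugate()
                      for r, c in zip(received, cyclic_shift(base, s))))
              for s in range(n)]
    return max(range(n), key=scores.__getitem__)


base = chu_sequence(16)
print(csok_demap(csok_map(5, base), base))
```

Because every element has unit magnitude, a single shifted sequence has no amplitude peaks, and because distinct shifts are orthogonal, the correlator output is nonzero only at the transmitted shift in the noiseless case.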

  • Spectrum Sensing for Multiuser Network Based on Free Probability Theory

    Lei WANG  Baoyu ZHENG  Qingmin MENG  Chao CHEN  

     
    PAPER-Wireless Communication Technologies

    Vol: E93-B  No: 1  Page(s): 65-72

    Based on Free Probability Theory (FPT), which has become an important branch of Random Matrix Theory (RMT), a new frequency-band sensing scheme for Cognitive Radio (CR) in Direct-Sequence Code-Division Multiple-Access (DS-CDMA) multiuser networks is proposed. Unlike previous studies in the field, the new scheme does not require knowledge of the users' spreading sequences and is related to the asymptotic free behavior of random matrices. Simulation results show that the asymptotic claims hold true even for a small number of observations (which makes the scheme convenient for time-varying topologies) and that the scheme outperforms the classical energy detection scheme and another scheme based on random matrix theory.

  • Estimation of Radio Communication Distance along Random Rough Surface

    Junichi HONDA  Kazunori UCHIDA  Kwang-Yeol YOON  

     
    PAPER

    Vol: E93-C  No: 1  Page(s): 39-45

    This paper is concerned with the estimation of radio communication distance when both the transmitter and receiver are arbitrarily distributed on a random rough surface such as a desert, terrain, or sea surface. First, we simulate electromagnetic wave propagation along the rough surface by using the discrete ray tracing method (DRTM) recently proposed by the authors. Second, we determine three parameters by the conjugate gradient method (CGM) combined with the method of least squares. Finally, we derive an analytical expression which can estimate the maximum communication distance when the input power of a transmitter and the minimum detectable electric intensity of a receiver are specified. Random rough surfaces are assumed to be Gaussian, pn-th order power law or exponential distributions.

  • A Robust Secure Cooperative Spectrum Sensing Scheme Based on Evidence Theory and Robust Statistics in Cognitive Radio

    Nhan NGUYEN-THANH  Insoo KOO  

     
    PAPER-Spectrum Sensing

    Vol: E92-B  No: 12  Page(s): 3644-3652

    Spectrum sensing is a key technology within Cognitive Radio (CR) systems. Cooperative spectrum sensing using a distributed model provides improved detection of the primary user, but it opens the CR system to a new security threat. This threat is the degradation of cooperative sensing performance due to spectrum sensing data falsification by malicious users. Our proposed scheme, based on robust statistics, utilizes only the available past received-power data of the sensing nodes to estimate the distribution parameters of the primary signal presence and absence hypotheses. These estimated parameters are used to perform data fusion based on the Dempster-Shafer theory of evidence, which eliminates malicious users. Furthermore, in order to enhance performance, a node's reliability weight is incorporated into the data fusion scheme. Simulation results indicate that our proposed scheme is highly capable of eliminating malicious users and provides a high data fusion gain under various channel conditions.
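
The fusion step named in the abstract rests on Dempster's rule of combination. The sketch below applies the standard rule to two sensing reports over the frame {H0, H1}, where H0 means the primary signal is absent and H1 means it is present. The mass values are made up for illustration, and the paper's robust parameter estimation and reliability weighting are not modeled.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions over {"H0", "H1", "U"} with Dempster's rule.

    "U" stands for the whole frame {H0, H1}, i.e. the uncommitted mass.
    """
    # Conflict: mass assigned to pairs of focal sets with an empty intersection.
    conflict = m1["H0"] * m2["H1"] + m1["H1"] * m2["H0"]
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    norm = 1.0 - conflict
    combined = {}
    for h in ("H0", "H1"):
        # Pairs whose intersection is {h}: (h, h), (h, U), (U, h).
        combined[h] = (m1[h] * m2[h] + m1[h] * m2["U"] + m1["U"] * m2[h]) / norm
    combined["U"] = m1["U"] * m2["U"] / norm
    return combined


# Hypothetical reports from two sensing nodes.
node_a = {"H1": 0.6, "H0": 0.1, "U": 0.3}
node_b = {"H1": 0.5, "H0": 0.2, "U": 0.3}
fused = dempster_combine(node_a, node_b)
print(fused)
```

Two reports that both lean toward H1 reinforce each other after fusion, whereas a falsified report that contradicts the others mainly increases the conflict term, which is one reason to identify and down-weight unreliable nodes before combining.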

  • Frequency-Domain Equalization for Coherent Optical Single-Carrier Transmission Systems

    Koichi ISHIHARA  Takayuki KOBAYASHI  Riichi KUDO  Yasushi TAKATORI  Akihide SANO  Yutaka MIYAMOTO  

     
    PAPER-Fiber-Optic Transmission for Communications

    Vol: E92-B  No: 12  Page(s): 3736-3743

    In this paper, we use frequency-domain equalization (FDE) to create coherent optical single-carrier (CO-SC) transmission systems that are very tolerant of chromatic dispersion (CD) and polarization mode dispersion (PMD). The efficient transmission of a 25-Gb/s NRZ-QPSK signal by using the proposed FDE is demonstrated under severe CD and PMD conditions. We also discuss the principle of FDE and some techniques suitable for implementing CO-SC-FDE. The results show that a CO-SC-FDE system is very tolerant of CD and PMD and can achieve high transmission rates over single mode fiber without optical dispersion compensation.

  • Dynamic Spectrum Access to the Combined Resource of Commercial and Public Safety Bands Based on a WCDMA Shared Network

    Hyoungsuk JEON  Sooyeol IM  Youmin KIM  Seunghee KIM  Jinup KIM  Hyuckjae LEE  

     
    LETTER-Spectrum Allocation

    Vol: E92-B  No: 12  Page(s): 3581-3585

    The public safety spectrum is generally under-utilized because its traffic is bursty and mission-critical. This letter considers the application of dynamic spectrum access (DSA) to the combined spectrum of public safety (PS) and commercial (CMR) users in a common shared network that can provide both PS and CMR services. Our scenario includes the 700 MHz Public/Private Partnership which was recently issued by the Federal Communications Commission. We first propose an efficient DSA mechanism to coordinate the combined spectrum, and then establish a call admission control that reflects the proposed DSA in a wideband code division multiple access based network. The essentials of our proposed DSA are opportunistic access to the public safety spectrum and priority access to the commercial spectrum. Simulation results show that these schemes are well harmonized in various network environments.

  • Trade-Off Analysis between Timing Error Rate and Power Dissipation for Adaptive Speed Control with Timing Error Prediction

    Hiroshi FUKETA  Masanori HASHIMOTO  Yukio MITSUYAMA  Takao ONOYE  

     
    PAPER-Logic Synthesis, Test and Verification

    Vol: E92-A  No: 12  Page(s): 3094-3102

    The timing margin varies from chip to chip due to manufacturing variability, and depends on the operating environment and aging. Adaptive speed control with timing error prediction is a promising way to mitigate the timing margin variation, but it inherently carries a critical risk of timing errors when a circuit is slowed down. This paper presents how to evaluate the relation between timing error rate and power dissipation in self-adaptive circuits with timing error prediction. The discussion is experimentally validated using adders in subthreshold operation in a 90 nm CMOS process. We show a trade-off between timing error rate and power dissipation, and reveal the dependency of the trade-off on design parameters.

  • Low-Complexity Wideband LSF Quantization Using Algebraic Trellis VQ

    Abdellah KADDAI  Mohammed HALIMI  

     
    PAPER-Speech and Hearing

    Vol: E92-D  No: 12  Page(s): 2478-2486

    In this paper, an algebraic trellis vector quantization (ATVQ) that introduces algebraic codebooks into the trellis coded vector quantization (TCVQ) structure is presented. Low encoding complexity and minimal memory storage requirements are achieved using the proposed approach. It exploits the advantages of both TCVQ and algebraic codebooks, namely delayed decision, codebook widening, low computational complexity, and no codebook storage. This novel vector quantization scheme is used to encode the line spectral frequency (LSF) parameters of wideband speech. Experimental results on wideband speech show that ATVQ yields the same performance as the traditional split vector quantization (SVQ) and the TCVQ in terms of spectral distortion (SD). It can achieve transparent quality at 47 bits/frame with a considerable reduction in memory storage and computational complexity compared to SVQ and TCVQ.

  • Evaluation of Effective Conductivity of Copper-Clad Dielectric Laminate Substrates in Millimeter-Wave Bands Using Whispering Gallery Mode Resonators

    Thi Huong TRAN  Yuanfeng SHE  Jiro HIROKAWA  Kimio SAKURAI  Yoshinori KOGAMI  Makoto ANDO  

     
    PAPER-Electronic Materials

    Vol: E92-C  No: 12  Page(s): 1504-1511

    This paper presents a measurement method for determining the effective conductivity of copper-clad dielectric laminate substrates in the millimeter-wave region. The conductivity is indirectly evaluated from the measured resonant frequencies and unloaded Q values of a number of Whispering Gallery (WG) modes excited in a circular disk sample, which consists of a copper-clad dielectric substrate with a large diameter of 20-30 wavelengths. We can therefore easily obtain the frequency dependence of the effective conductivity of the sample under test over a wide frequency range at once. Almost identical conductivity is predicted for two kinds of WG resonators (the copper-clad type and the sandwich type) with different field distributions; this self-consistency provides an important foundation for the method, given that no alternative method is available at this moment. We measure three kinds of copper foils in the 55-65 GHz band, where the conductivity of electrodeposited copper foil is smaller than that of rolled copper foil and shiny-both-sides copper foil. The measured conductivity of the electrodeposited copper foil decreases as the frequency increases. The transmission losses measured for microstrip lines fabricated from these substrates are accurately predicted with the conductivity evaluated by this method.

  • Effective Prediction of Errors by Non-native Speakers Using Decision Tree for Speech Recognition-Based CALL System

    Hongcui WANG  Tatsuya KAWAHARA  

     
    PAPER-Speech and Hearing

    Vol: E92-D  No: 12  Page(s): 2462-2468

    CALL (Computer Assisted Language Learning) systems using ASR (Automatic Speech Recognition) for second language learning have received increasing interest recently. However, it remains a challenge to achieve high speech recognition performance, including accurate detection of erroneous utterances by non-native speakers. Conventionally, possible error patterns, based on linguistic knowledge, are added to the lexicon and language model, or to the ASR grammar network. However, this approach easily runs into a trade-off between error coverage and increased perplexity. To solve the problem, we propose a method based on a decision tree to learn effective prediction of errors made by non-native speakers. An experimental evaluation with a number of foreign students learning Japanese shows that the proposed method can effectively generate an ASR grammar network, given a target sentence, to achieve both better coverage of errors and smaller perplexity, resulting in significant improvement in ASR accuracy.

  • A Novel Dynamic Channel Access Scheme Using Overlap FFT Filter-Bank for Cognitive Radio

    Motohiro TANABE  Masahiro UMEHIRA  Koichi ISHIHARA  Yasushi TAKATORI  

     
    PAPER-Spectrum Allocation

    Vol: E92-B  No: 12  Page(s): 3589-3596

    An OFDMA-based channel access scheme has been proposed for dynamic spectrum access to utilize the frequency spectrum efficiently. Though the OFDMA-based scheme is flexible enough to change the bandwidth and channel of the transmitted signals, the OFDMA signal has a large PAPR (Peak to Average Power Ratio). In addition, if the OFDMA receiver does not use a filter to extract sub-carriers before FFT (Fast Fourier Transform) processing, the designated sub-carriers suffer large interference from adjacent channel signals in the FFT processing on the receiving side. To solve problems such as the PAPR and adjacent channel interference encountered in the OFDMA-based scheme, this paper proposes a novel dynamic channel access scheme using an overlap FFT filter-bank based on single-carrier modulation. It also presents performance evaluation results of the proposed scheme obtained by computer simulation.
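
The PAPR contrast between multi-carrier and single-carrier signals mentioned above can be made concrete with a small sketch. It compares a worst-case multi-carrier block (all sub-carriers in phase) with a unit-modulus symbol stream sampled at the symbol rate. The block length, the naive inverse DFT, and the symbol pattern are assumptions for illustration; pulse shaping, which raises the PAPR of real single-carrier signals above 1, is ignored, and the paper's overlap FFT filter-bank is not modeled.

```python
import cmath


def papr(signal):
    """Peak-to-average power ratio (linear scale) of a complex sequence."""
    powers = [abs(x) ** 2 for x in signal]
    return max(powers) / (sum(powers) / len(powers))


def idft(freq_bins):
    """Naive inverse DFT, O(N^2); adequate for a short illustration."""
    n = len(freq_bins)
    return [sum(X * cmath.exp(2j * cmath.pi * k * t / n)
                for k, X in enumerate(freq_bins)) / n
            for t in range(n)]


N = 16
# Worst case for a multi-carrier block: every sub-carrier carries the same symbol.
multicarrier = idft([1.0] * N)
# Unit-modulus QPSK-like symbols, one sample per symbol, no pulse shaping.
single_carrier = [cmath.exp(1j * cmath.pi * (2 * t + 1) / 4) for t in range(N)]

print(papr(multicarrier), papr(single_carrier))
```

In this idealized setting the multi-carrier block concentrates all its energy in one sample, so its PAPR grows with the number of sub-carriers, while the single-carrier stream stays at the minimum value.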

  • Capacity Analysis of Cooperative Relaying Networks with Adaptive Relaying Scheme Selection

    Kunihiko TESHIMA  Koji YAMAMOTO  Hidekazu MURATA  Susumu YOSHIDA  

     
    PAPER-Network

    Vol: E92-B  No: 12  Page(s): 3744-3752

    In the present paper, the performance of cooperative relaying networks with adaptive relaying scheme selection is analyzed. Cooperative relaying is a new technique to achieve spatial diversity gain by using neighboring stations. However, when multiple stations transmit simultaneously, the number of interference signals increases. Therefore, the introduction of cooperative relaying in radio communication systems does not always increase the network capacity, due to the co-channel interference. Thus, in order to achieve high spectral efficiency, it is necessary to select cooperative relaying or non-cooperative relaying adaptively. Assuming both centralized and decentralized adaptive controls, the spectral efficiency is evaluated. The performance under decentralized control is evaluated using a game-theoretic approach. Simulation results show that the introduction of cooperative relaying with centralized control always increases the spectral efficiency. On the other hand, simulation results also show that, when each source selects a relaying scheme independently and selfishly to maximize its own spectral efficiency, the introduction of cooperative relaying may reduce the spectral efficiency due to the increase in the number of interference signals.

  • A Simple MAC Protocol for Cognitive Wireless Networks

    Abdorasoul GHASEMI  S. Mohammad RAZAVIZADEH  

     
    PAPER-Protocols

    Vol: E92-B  No: 12  Page(s): 3693-3700

    A simple distributed Medium Access Control (MAC) protocol for cognitive wireless networks is proposed. It is assumed that the network is slotted, the spectrum is divided into a number of channels, and the statistical aggregate traffic of the primary network on each channel is modeled by independent Bernoulli random variables. The objective of the cognitive MAC is to maximize the exploitation of the channels' idle time slots. The cognitive users can achieve this aim by appropriate hopping between the channels at each decision stage. The proposed protocol is based on the rule of least failures, which is applied by each user independently. Using this rule, at each decision stage, a channel with the least number of recorded collisions with the primary and other cognitive users is selected for exploitation. The performance of the proposed protocol for multiple cognitive users is investigated analytically and verified by simulation. It is shown that, as the number of users increases, the user decision under this protocol comes close to the optimum decision to maximize its own utilization. In addition, to improve opportunity utilization in the case of a large number of cognitive users, an extension to the proposed MAC protocol is presented and evaluated by simulation.
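
The rule of least failures described above can be sketched for a single cognitive user as follows. The class and function names are hypothetical, ties are broken by lowest channel index (the abstract does not specify a tie-breaking rule), and collisions with other cognitive users as well as the paper's extension for many users are left out.

```python
import random


class LeastFailuresUser:
    """One cognitive user that tracks its recorded collisions per channel."""

    def __init__(self, num_channels):
        self.failures = [0] * num_channels

    def choose_channel(self):
        # Pick a channel with the fewest recorded collisions.
        # Assumption: ties go to the lowest channel index.
        return min(range(len(self.failures)), key=self.failures.__getitem__)

    def record(self, channel, collided):
        if collided:
            self.failures[channel] += 1


def run(user, busy_probabilities, num_slots, rng):
    """Simulate one user against Bernoulli primary traffic; return successful slots."""
    successes = 0
    for _ in range(num_slots):
        channel = user.choose_channel()
        # The primary user occupies the channel with the given probability.
        collided = rng.random() < busy_probabilities[channel]
        user.record(channel, collided)
        successes += 0 if collided else 1
    return successes


user = LeastFailuresUser(num_channels=3)
print(run(user, [0.8, 0.5, 0.1], num_slots=1000, rng=random.Random(1)))
print(user.failures)
```

Each user needs only its own collision counters, so the rule requires no message exchange between users, which matches the distributed setting of the abstract.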