Keyword Search Result

[Keyword] Ada(1871hit)

1021-1040hit(1871hit)

  • Acoustic Model Adaptation Using First-Order Linear Prediction for Reverberant Speech

    Tetsuya TAKIGUCHI  Masafumi NISHIMURA  Yasuo ARIKI  

     
    PAPER-Speech Recognition

    Vol: E89-D No:3  Page(s): 908-914

    This paper describes a hands-free speech recognition technique based on acoustic model adaptation to reverberant speech. In hands-free speech recognition, the recognition accuracy is degraded by reverberation, since each segment of speech is affected by the reflection energy of the preceding segment. To compensate for the reflection signal we introduce a frame-by-frame adaptation method adding the reflection signal to the means of the acoustic model. The reflection signal is approximated by a first-order linear prediction from the observation signal at the preceding frame, and the linear prediction coefficient is estimated with a maximum likelihood method by using the EM algorithm, which maximizes the likelihood of the adaptation data. Its effectiveness is confirmed by word recognition experiments on reverberant speech.
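The frame-by-frame compensation described in this abstract can be illustrated with a minimal Python sketch. It assumes a scalar feature and a single Gaussian with fixed variance, so the maximum likelihood estimate of the prediction coefficient has a closed form; the paper itself estimates the coefficient with the EM algorithm over an acoustic model. The function names `adapted_mean` and `estimate_coefficient` are hypothetical.

```python
def adapted_mean(model_mean, prev_observation, coeff):
    """Shift the model mean by the reflection predicted from the previous frame."""
    return model_mean + coeff * prev_observation


def estimate_coefficient(observations, model_mean):
    """Closed-form ML estimate of the first-order prediction coefficient.

    Assumption (not from the abstract): one scalar Gaussian with fixed
    variance, so maximizing the likelihood reduces to least squares.
    """
    numerator = 0.0
    denominator = 0.0
    # Pair each frame with the frame that precedes it.
    for prev, curr in zip(observations, observations[1:]):
        numerator += (curr - model_mean) * prev
        denominator += prev * prev
    if denominator == 0.0:
        raise ValueError("previous-frame observations are all zero")
    return numerator / denominator


# Synthetic frames generated as o_t = 1.0 + 0.5 * o_{t-1}
frames = [4.0, 3.0, 2.5, 2.25]
coeff = estimate_coefficient(frames, model_mean=1.0)
```

In a real recognizer the same shift would be applied to every Gaussian mean at every frame, and the coefficient would be re-estimated from state occupancies rather than from a single known mean.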

  • Context-Dependent Boundary Model for Refining Boundaries Segmentation of TTS Units

    Lijuan WANG  Yong ZHAO  Min CHU  Frank K. SOONG  Jianlai ZHOU  Zhigang CAO  

     
    PAPER-Speech Synthesis

    Vol: E89-D No:3  Page(s): 1082-1091

    For producing high quality synthesis, a concatenation-based Text-to-Speech (TTS) system usually requires a large number of segmental units to cover various acoustic-phonetic contexts. However, careful manual labeling and segmentation by human experts, which is still the most reliable way to prepare such units, is labor intensive. In this paper we adopt a two-step procedure to automate the labeling, segmentation and refinement process. In the first step, coarse segmentation of speech data is performed by aligning speech signals with the corresponding sequence of Hidden Markov Models (HMMs). Then in the second step, segment boundaries are refined with a proposed Context-Dependent Boundary Model (CDBM). A Classification and Regression Tree (CART) is adopted to organize available data into a structured hierarchical tree, where acoustically similar boundaries are clustered together to train tied CDBM models for boundary refinement. Optimal CDBM parameters and training conditions are found through a series of experimental studies. Compared with a manual segmentation reference, segmentation accuracy (within a tolerance of 20 ms) is improved by the CDBMs from 78.1% (baseline) to 94.8% in Mandarin Chinese and from 81.4% to 92.7% in English, with about 1,000 manually segmented sentences used in training the models. To further reduce the amount of manual data for training CDBMs of a new speaker, we adapt a well-trained CDBM via efficient adaptation algorithms. With only 10-20 manually segmented sentences as adaptation data, the adapted CDBM achieves a segmentation accuracy of 90%.

  • Single-Channel Multiple Regression for In-Car Speech Enhancement

    Weifeng LI  Katsunobu ITOU  Kazuya TAKEDA  Fumitada ITAKURA  

     
    PAPER-Speech Enhancement

    Vol: E89-D No:3  Page(s): 1032-1039

    We address issues for improving hands-free speech enhancement and speech recognition performance in different car environments using a single distant microphone. This paper describes a new single-channel in-car speech enhancement method that estimates the log spectra of speech at a close-talking microphone based on the nonlinear regression of the log spectra of the noisy signal captured by a distant microphone and the estimated noise. The proposed method provides significant overall quality improvements in our subjective evaluation of the regression-enhanced speech, and performs best on most objective measures. Based on our isolated word recognition experiments conducted under 15 real car environments, the proposed adaptive nonlinear regression approach shows an advantage in average relative word error rate (WER) reductions of 50.8% and 13.1%, respectively, compared to original noisy speech and ETSI advanced front-end (ETSI ES 202 050).

  • ATR Parallel Decoding Based Speech Recognition System Robust to Noise and Speaking Styles

    Shigeki MATSUDA  Takatoshi JITSUHIRO  Konstantin MARKOV  Satoshi NAKAMURA  

     
    PAPER-Speech Recognition

    Vol: E89-D No:3  Page(s): 989-997

    In this paper, we describe a parallel decoding-based ASR system developed at ATR that is robust to noise type, SNR and speaking style. It is difficult to recognize speech affected by various factors, especially when an ASR system contains only a single acoustic model. One solution is to employ multiple acoustic models, one model for each different condition. Even though the robustness of each acoustic model is limited, the whole ASR system can handle various conditions appropriately. In our system, there are two recognition sub-systems which use different features such as MFCC and Differential MFCC (DMFCC). Each sub-system has several acoustic models depending on SNR, speaker gender and speaking style, and during recognition each acoustic model is adapted by fast noise adaptation. From each sub-system, one hypothesis is selected based on posterior probability. The final recognition result is obtained by combining the best hypotheses from the two sub-systems. On the AURORA-2J task used widely for the evaluation of noise robustness, our system achieved higher recognition performance than a system which contains only a single model. Also, our system was tested using normal and hyper-articulated speech contaminated by several background noises, and exhibited high robustness to noise and speaking styles.

  • A Style Adaptation Technique for Speech Synthesis Using HSMM and Suprasegmental Features

    Makoto TACHIBANA  Junichi YAMAGISHI  Takashi MASUKO  Takao KOBAYASHI  

     
    PAPER-Speech Synthesis

    Vol: E89-D No:3  Page(s): 1092-1099

    This paper proposes a technique for synthesizing speech with a desired speaking style and/or emotional expression, based on model adaptation in an HMM-based speech synthesis framework. Speaking styles and emotional expressions are characterized by many segmental and suprasegmental features in both spectral and prosodic features. Therefore, it is essential to take account of these features in the model adaptation. The proposed technique, called style adaptation, deals with this issue. First, the maximum likelihood linear regression (MLLR) algorithm, based on the framework of the hidden semi-Markov model (HSMM), is presented to provide a mathematically rigorous and robust adaptation of state duration and to adapt both the spectral and prosodic features. Then, a novel tying method for the regression matrices of the MLLR algorithm is presented to allow the incorporation of both the segmental and suprasegmental speech features into the style adaptation. The proposed tying method uses regression class trees with contextual information. From the results of several subjective tests, we show that these techniques can perform style adaptation while maintaining naturalness of the synthetic speech.

  • Adaptive Clock Recovery Method Utilizing Proportional-Integral-Derivative (PID) Control for Circuit Emulation

    Youichi FUKADA  Takeshi YASUDA  Shuji KOMATSU  Koichi SAITO  Yoichi MAEDA  Yasuyuki OKUMURA  

     
    PAPER

    Vol: E89-B No:3  Page(s): 690-695

    This paper describes a novel adaptive clock recovery method that uses proportional-integral-derivative (PID) control. The adaptive clock method is a clock recovery technique that synchronizes connected terminals via packet networks, and will be indispensable for circuit emulation services in the next generation Ethernet. Our adaptive clock method simultaneously achieves a short starting-time, accuracy, stable recovery clock frequency, and few buffer delays using the PID control technique. We explain the numerical simulations, experimental results, and circuit designs.
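As a rough illustration of PID-based adaptive clock recovery, the Python sketch below steers a receiver clock so that a jitter buffer stays at a target fill level. Using the buffer fill as the error signal, the gain values, the one-second update interval, and the name `simulate_pid_clock_recovery` are all assumptions made for this sketch; the abstract does not give the authors' error signal or parameters.

```python
def simulate_pid_clock_recovery(sender_hz, nominal_hz, steps,
                                kp=0.2, ki=0.02, kd=0.05, target_fill=50.0):
    """Track an unknown sender rate by regulating a receive buffer with PID.

    Hypothetical model: the buffer gains (sender_hz - recovered_hz) packets
    per one-second step, and the PID output is added to the nominal clock.
    """
    fill = target_fill
    integral = 0.0
    prev_error = 0.0
    recovered_hz = nominal_hz
    for _ in range(steps):
        error = fill - target_fill          # proportional term input
        integral += error                   # accumulated error
        derivative = error - prev_error     # change since last step
        recovered_hz = (nominal_hz + kp * error
                        + ki * integral + kd * derivative)
        # Buffer fills when the sender is faster than the recovered clock.
        fill += sender_hz - recovered_hz
        prev_error = error
    return recovered_hz, fill


recovered, fill_level = simulate_pid_clock_recovery(
    sender_hz=1000.5, nominal_hz=1000.0, steps=2000)
```

The integral term is what removes the steady-state frequency offset: once the buffer is back at its target, the accumulated error alone supplies the correction to the nominal clock.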

  • New Formula of the Polarization Entropy

    Jian YANG  Yilun CHEN  Yingning PENG  Yoshio YAMAGUCHI  Hiroyoshi YAMADA  

     
    LETTER-Sensing

    Vol: E89-B No:3  Page(s): 1033-1035

    In this letter, a new formula is proposed for calculating the polarization entropy, based on the least squares method. The formula requires neither the eigenvalues of a covariance matrix nor logarithms, so the time for computing the polarization entropy is reduced. Using polarimetric SAR data, the authors validate the effectiveness of the new formula.
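For context, the conventional eigenvalue-based definition that this letter seeks to avoid can be sketched in Python as follows. This is the standard entropy H = -sum(p_i * log_n(p_i)) with p_i = lambda_i / sum(lambda), not the new formula, which the abstract does not state. The function name and the choice to pass precomputed eigenvalues are assumptions of this sketch.

```python
import math


def polarization_entropy(eigenvalues):
    """Conventional polarization entropy from covariance-matrix eigenvalues.

    The logarithm base equals the number of eigenvalues (3 for a 3x3
    matrix), so the result lies between 0 and 1.
    """
    n = len(eigenvalues)
    if n < 2:
        raise ValueError("need at least two eigenvalues")
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    entropy = 0.0
    for lam in eigenvalues:
        p = lam / total
        if p > 0:  # the limit of p*log(p) as p -> 0 is 0
            entropy -= p * math.log(p, n)
    return entropy


h_random = polarization_entropy([1.0, 1.0, 1.0])   # fully random scattering
h_single = polarization_entropy([1.0, 0.0, 0.0])   # single dominant mechanism
```

Both the eigendecomposition that produces the inputs and the logarithm in the loop are the costs the letter reports avoiding.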

  • Coverage Shrinking and Available Data Rate Variations for 3G CDMA Mobile Cellular Systems

    Yuh-Ren TSAI  Kai-Jie YANG  

     
    PAPER-Network

    Vol: E89-B No:3  Page(s): 739-747

    In 3G CDMA mobile communication systems, high data rate services are essential for many key applications. When an MS approaches the cell border, link performance is degraded and more power should be allocated to maintain the link performance. Since the maximum available signal power is limited, the link adaptation mechanism may diminish the data rate to maintain link performance. This implies that the valid coverage shrinks when the data rate increases. The shrinking of valid coverage under a predetermined data rate strongly affects the reliability of high data rate services. In this work, the encoded bit error probabilities of 3G CDMA mobile communication systems, over large-scale and large-small-scale fading channels, were analyzed based on SGA and SIGA methods. Analytic methods were also proposed to investigate the issues of coverage shrinking and service data rate variations. Furthermore, the outage probability, cell coverage percentage and the staying probabilities of available data rates were examined. The proposed analytic methods can be applied, as a preliminary research, to the design of cellular-system-related techniques, such as QoS control, available data rate prediction, power reservation, and service adaptation.

  • Improving Rapid Unsupervised Speaker Adaptation Based on HMM-Sufficient Statistics in Noisy Environments Using Multi-Template Models

    Randy GOMEZ  Akinobu LEE  Tomoki TODA  Hiroshi SARUWATARI  Kiyohiro SHIKANO  

     
    PAPER-Speech Recognition

    Vol: E89-D No:3  Page(s): 998-1005

    This paper describes a method of using multi-template unsupervised speaker adaptation based on HMM-Sufficient Statistics to improve the adaptation performance while keeping adaptation time within a few seconds with just one arbitrary utterance. This adaptation scheme is mainly composed of two processes. The first part is done offline and involves the training of multiple class-dependent acoustic models and the creation of speakers' HMM-Sufficient Statistics based on gender and age. The second part is performed online, where adaptation begins using the single utterance of a test speaker. From this utterance, the system classifies the speaker's class and consequently selects the N-best neighbor speakers close to the utterance using Gaussian Mixture Models (GMM). The classified speakers' class template model is then adopted as a base model. From this template model, the adapted model is rapidly constructed using the N-best neighbor speakers' HMM-Sufficient Statistics. Experiments in noisy environment conditions with 20 dB, 15 dB and 10 dB SNR office, crowd, booth, and car noise are performed. The proposed multi-template method achieved 89.5% word accuracy rate compared with 88.1% of the conventional single-template method, while the baseline recognition rate without adaptation is 86.4%. Moreover, experiments using Vocal Tract Length Normalization (VTLN) and supervised Maximum Likelihood Linear Regression (MLLR) are also compared.

  • Adaptive Linear Detectors in Space-Time Block Coded Multiuser Systems

    Hyeon Chyeol HWANG  Seung Hoon SHIN  Seok Ho KIM  Kyung Sup KWAK  

     
    LETTER-Wireless Communication Technologies

    Vol: E89-B No:3  Page(s): 999-1002

    In this letter, we propose adaptive linear detectors in space-time block coded multiuser systems, by exploiting a particular property of the minimum mean square error multiuser detector. The proposed scheme can provide much faster convergence than the existing adaptive scheme [5] and so lower the system overhead requirements.

  • An Adaptive Algorithm with Variable Step-Size for Parallel Notch Filter

    Arata KAWAMURA  Youji IIGUNI  Yoshio ITOH  

     
    PAPER-Digital Signal Processing

    Vol: E89-A No:2  Page(s): 511-519

    A parallel notch filter (PNF) for eliminating a sinusoidal signal whose frequency and phase are unknown has been proposed previously. The PNF achieves both fast convergence and high estimation accuracy when the step-size for adaptation is appropriately determined. However, there has been no discussion of how to determine the appropriate step-size. In this paper, we derive the convergence condition on the step-size, and propose an adaptive algorithm with variable step-size so that convergence of the PNF is automatically satisfied. Moreover, we present a new filtering structure of the PNF that increases the convergence speed while keeping the estimation accuracy. We also derive a variable step-size scheme for the new PNF to guarantee the convergence. Simulation results show the effectiveness of the proposed method.

  • Transient Analysis of Complex-Domain Adaptive Threshold Nonlinear Algorithm (c-ATNA) for Adaptive Filters in Applications to Digital QAM Systems

    Shin'ichi KOIKE  

     
    PAPER-Digital Signal Processing

    Vol: E89-A No:2  Page(s): 469-478

    This paper presents an adaptive algorithm named adaptive threshold nonlinear algorithm for use in adaptive filters in the complex-number domain (c-ATNA) in applications to digital QAM systems. Although the c-ATNA is very simple to implement, it makes adaptive filters highly robust against impulse noise and at the same time it ensures filter convergence as fast as that of the well-known LMS algorithm. Analysis is developed to derive a set of difference equations for calculating transient behavior as well as steady-state performance. Experiments with simulations and theoretical calculations for some examples of filter convergence in the presence of Contaminated Gaussian Noise demonstrate that the c-ATNA is effective in combating impulse noise. Good agreement between simulated and theoretical convergence proves the validity of the analysis.

  • Performance of Feedback-Type Adaptive Array Antenna in FDD System with Rake Receiver

    Mona SHOKAIR  Yoshihiko AKAIWA  

     
    PAPER-Wireless Communication Technologies

    Vol: E89-B No:2  Page(s): 539-544

    The performance of a feedback-type adaptive array antenna (AAA) system placed only at a base station (BS) in an FDD/DS-CDMA system remains insufficiently clear. We evaluate the performance of this system by considering the effect of a rake receiver, spacing distance between antennas, the maximum Doppler frequency (fd), and control delay time (Td) on BER performance. In this system, the mobile station (MS) determines optimum weights of antenna elements and sends them back to BS as feedback information. We assume that the optimum weights are not quantized. Thereby, we estimate the performance degradation of 3GPP transmit diversity system, where the feedback information is quantized using a few bits. Computer simulation results show that the rake receiver achieves better BER performance because of the time diversity effect with rake receiver. The AAA with a wide antenna spacing gives high diversity gain for the received signals. For a high value of fd Td, BER performance becomes worse because weighting factors cannot follow the changing speed of channel characteristics. The degradation in performance of a 3GPP system is clarified.

  • A New Linear Transconductor Combining a Source Coupled Pair with a Transconductor Using Bias-Offset Technique

    Isamu YAMAGUCHI  Fujihiko MATSUMOTO  Makoto IZUMA  Yasuaki NOGUCHI  

     
    PAPER

    Vol: E89-A No:2  Page(s): 369-376

    In practice, the linearity of a transconductor with a theoretically linear characteristic is deteriorated by mobility degradation. In this paper, a technique to improve the linearity by combining a source-coupled pair with the transconductor is proposed. In the proposed transconductor, the deteriorated linearity of the conventional part is compensated by the transconductance characteristic of the source-coupled pair. In order to confirm the validity of the proposed technique, SPICE simulation is carried out. The transconductance change ratio of the proposed technique is about 1% and is 1/10 or less of that of the conventional circuit.

  • A Noise Reduction System for Wideband and Sinusoidal Noise Based on Adaptive Line Enhancer and Inverse Filter

    Naoto SASAOKA  Keisuke SUMI  Yoshio ITOH  Kensaku FUJII  Arata KAWAMURA  

     
    PAPER-Digital Signal Processing

    Vol: E89-A No:2  Page(s): 503-510

    A noise reduction technique to reduce wideband and sinusoidal noise in noisy speech is proposed. In an actual environment, background noise includes not only wideband noise but also sinusoidal noise, such as ventilation fan and engine noise. In this paper, we propose a new noise reduction system which uses two types of adaptive line enhancers (ALE) and a noise estimation filter (NEF). First, the two ALEs are used to estimate speech components. The first ALE is used to reduce sinusoidal noise superposed on speech and wideband noise, while the second ALE is used to reduce wideband noise superposed on speech. However, since the quality of the speech enhanced by the two ALEs is not good enough due to the difficulty in estimating unvoiced sound using the two ALEs, the NEF is used to improve the noise reduction capability. The NEF accurately estimates the background noise from the signal occupied by noise components, which is obtained by subtracting the speech enhanced by the two ALEs from the noisy speech. The enhanced speech is obtained by subtracting the estimated noise from the noisy speech. Furthermore, a noise reduction system with a feedback path is proposed to further improve the quality of the enhanced speech.

  • Development and Implementation of an Interactive Parallelization Assistance Tool for OpenMP: iPat/OMP

    Makoto ISHIHARA  Hiroki HONDA  Mitsuhisa SATO  

     
    PAPER-Parallel/Distributed Programming Models, Paradigms and Tools

    Vol: E89-D No:2  Page(s): 399-407

    iPat/OMP is an interactive parallelization assistance tool for OpenMP. In the present paper, we describe the design concept of iPat/OMP, the parallelization sequence achieved by the tool and its current implementation status. In addition, we present an evaluation of the performance of the implemented functionalities. The experimental results show that iPat/OMP can detect parallelism and create an appropriate OpenMP directive for several for-loops.

  • Stochastic Method of Determining Substream Modulation Levels for MIMO Eigenbeam Space Division Multiplexing

    Satoshi TAKAHASHI  Chang-Jun AHN  Hiroshi HARADA  

     
    PAPER-Wireless Communication Technologies

    Vol: E89-B No:1  Page(s): 142-149

    Multiple-input multiple-output (MIMO) eigenbeam space division multiplexing that uses adaptive modulations for substreams is a promising technology for improving transmission capacity. A fundamental drawback of this approach is that the modulation levels determined from the carrier-to-noise ratio at each substream are sometimes overly optimistic so the use of these modulation levels results in transmission errors and diminished transmission performance. A novel method of determining substream modulation levels is proposed that alleviates this degradation. In the proposed method, the expected bit error rates for possible modulations of each substream are calculated from delay profiles. Simulation results indicate that transmission capacity is improved by 30% using the new method compared with the conventional method.

  • A Near-Optimal Low-Complexity Transceiver for CP-Free Multi-Antenna OFDM Systems

    Chih-Yuan LIN  Jwo-Yuh WU  Ta-Sung LEE  

     
    PAPER-Wireless Communication Technologies

    Vol: E89-B No:1  Page(s): 88-99

    A conventional orthogonal frequency division multiplexing (OFDM) system uses a cyclic prefix (CP) to remove the channel-induced inter-symbol interference (ISI) at the cost of lower spectral efficiency. In this paper, a generalized sidelobe canceller (GSC) based equalizer for ISI suppression is proposed for uplink multi-antenna OFDM systems without CP. Based on the block representation of the CP-free OFDM system, there is a natural formulation of the ISI suppression problem under the GSC framework. By further exploiting the signal and ISI signature matrix structures, a computationally efficient partially adaptive (PA) implementation of the GSC-based equalizer is proposed for complexity reduction. The proposed scheme can be extended to the design of a pre-equalizer, which pre-suppresses the ISI and realizes CP-free downlink transmission to ease the computational burden of the mobile unit (MU). Simulation results show that the proposed GSC-based solutions yield equalization performance almost identical to that obtained by conventional CP-based OFDM systems and are highly resistant to increases in channel delay spread.

  • Polarimetric Scattering Analysis for a Finite Dihedral Corner Reflector

    Kei HAYASHI  Ryoichi SATO  Yoshio YAMAGUCHI  Hiroyoshi YAMADA  

     
    PAPER-Sensing

    Vol: E89-B No:1  Page(s): 191-195

    This paper examines polarimetric scattering characteristics caused by a dihedral corner reflector of finite size. The dihedral corner reflector is a basic model of double-bounce structures in urban areas. The detailed scattering information aids the interpretation of Polarimetric Synthetic Aperture Radar (POLSAR) data. The Finite-Difference Time-Domain (FDTD) method is used for the scattering calculation because of its simplicity and flexibility in target shape modeling. This paper points out that there exists a stable double-bounce squint angle region for both perfect electric conductor (PEC) and dielectric corner reflectors. Beyond this stable squint angle region, the scattering characteristics become completely different from the assumed response. A criterion for double-bounce scattering is proposed based on the physical optics (PO) approximation. Detailed analyses of the polarimetric index (co-polarization ratio) with respect to squint angle and an experimental result measured in an anechoic chamber are shown.

  • Coefficients--Delay Simultaneous Adaptation Scheme for Linear Equalization of Nonminimum Phase Channels

    Yusuke TSUDA  Jonah GAMBA  Tetsuya SHIMAMURA  

     
    PAPER-Digital Signal Processing

    Vol: E89-A No:1  Page(s): 248-259

    An efficient delay adaptation technique is introduced for accomplishing more accurate adaptive linear equalization of nonminimum phase channels. We focus on the fact that the filter structure and adaptation procedure of the adaptive Butler-Cantoni (ABC) equalizer are well suited to handling a variable delay at each iteration, compared with a classical adaptive linear transversal equalizer (LTE). We derive a cost function by comparing the system mismatch of an optimum equalizer coefficient vector with that of an equalizer coefficient vector under several delay settings. The cost function is the square of the difference between the absolute values of the first and last elements of the equalizer coefficient vector. A delay adaptation method based on this cost function is developed and incorporated into the ABC equalizer. The delay is adapted by checking the first and last elements of the equalizer coefficient vector, and this results in an LTE providing a lower mean square error level than other LTEs of the same order. We confirm the performance of the ABC equalizer with the delay adaptation method through computer simulations.
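The cost function stated in this abstract, the squared difference between the absolute values of the first and last equalizer coefficients, is simple enough to write down directly. The Python sketch below does only that; the name `delay_cost` and the example tap values are hypothetical, and the rule for how the delay is updated from the cost is not given in the abstract, so it is not shown.

```python
def delay_cost(coeffs):
    """Squared difference between the magnitudes of the first and last taps.

    `coeffs` is the equalizer coefficient vector; entries may be real or
    complex, since abs() returns the magnitude in both cases.
    """
    if len(coeffs) < 2:
        raise ValueError("need at least two coefficients")
    return (abs(coeffs[0]) - abs(coeffs[-1])) ** 2


# Hypothetical tap vector: |3+4j| = 5 and |1.0| = 1, so the cost is (5 - 1)^2.
taps = [3 + 4j, 0.5, 1.0]
cost = delay_cost(taps)
```

A cost of zero means the two end taps have equal magnitude, which is the balance the delay adaptation checks for.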
