The search functionality is under construction.

Keyword Search Result

[Keyword] CMN (4 hits)

1-4hit
  • Complex Noisy Independent Component Analysis by Negentropy Maximization

    Guobing QIAN  Liping LI  Hongshu LIAO  

     
    LETTER-Noise and Vibration

    Vol: E97-A  No: 12  Page(s): 2641-2644

    The maximization of non-Gaussianity is an effective approach to the complex independent component analysis (ICA) problem. However, the traditional complex maximization of non-Gaussianity (CMN) algorithm does not consider the influence of noise. In this letter, a modification of the fixed-point algorithm is proposed for the more practical case of the complex noisy ICA model. Simulations show that the proposed method significantly outperforms the traditional CMN algorithm in the noisy ICA model when the sample size is sufficient.

  • Distant Speech Recognition Using a Microphone Array Network

    Alberto Yoshihiro NAKANO  Seiichi NAKAGAWA  Kazumasa YAMAMOTO  

     
    PAPER-Microphone Array

    Vol: E93-D  No: 9  Page(s): 2451-2462

    In this work, spatial information consisting of the position and orientation angle of an acoustic source is estimated by an artificial neural network (ANN). The estimated position of a speaker in an enclosed space is used to refine the estimated time delays for a delay-and-sum beamformer, thus enhancing the output signal. The orientation angle, in turn, is used to restrict the lexicon used in the recognition phase, assuming that the speaker faces a particular direction while speaking. To compensate for the effect of the transmission channel inside a short frame analysis window, a new cepstral mean normalization (CMN) method based on a Gaussian mixture model (GMM) is investigated and shows better performance than the conventional CMN for short utterances. The performance of the proposed method is evaluated through Japanese digit/command recognition experiments.
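
The delay-and-sum step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes integer sample delays that are already known (the paper refines them from the ANN position estimate), and the function name `delay_and_sum` and its arguments are hypothetical.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay, then average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   arrival delay of each channel in samples (assumed known).
    """
    if len(channels) != len(delays):
        raise ValueError("one delay per channel is required")
    n = len(channels[0])
    out = [0.0] * n
    for signal, d in zip(channels, delays):
        for t in range(n):
            src = t + d  # advance the channel to undo its arrival delay
            if 0 <= src < n:  # samples shifted in from outside count as zero
                out[t] += signal[src]
    return [v / len(channels) for v in out]


# Two microphones observe the same pulse; the second hears it 2 samples later.
mic_a = [0, 1, 0, 0, 0, 0]
mic_b = [0, 0, 0, 1, 0, 0]
print(delay_and_sum([mic_a, mic_b], [0, 2]))  # [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
```

With correct delays the pulses add coherently, while uncorrelated noise on each channel would be averaged down; this is why better delay estimates enhance the output signal.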

  • Effective Acoustic Modeling for Pronunciation Quality Scoring of Strongly Accented Mandarin Speech

    Fengpei GE  Changliang LIU  Jian SHAO  Fuping PAN  Bin DONG  Yonghong YAN  

     
    PAPER-Speech and Hearing

    Vol: E91-D  No: 10  Page(s): 2485-2492

    In this paper we present our investigation into improving the performance of our computer-assisted language learning (CALL) system by exploiting the acoustic model and features within the speech recognition framework. First, to alleviate channel distortion, speaker-dependent cepstrum mean normalization (CMN) is adopted and the average correlation coefficient (average CC) between machine and expert scores is improved from 78.00% to 84.14%. Second, heteroscedastic linear discriminant analysis (HLDA) is adopted to enhance the discriminability of the acoustic model, which successfully increases the average CC from 84.14% to 84.62%. Additionally, HLDA makes the scoring accuracy more stable at various pronunciation proficiency levels, and thus leads to an increase in the speaker correct-rank rate from 85.59% to 90.99%. Finally, we use maximum a posteriori (MAP) estimation to tune the acoustic model to fit strongly accented test speech. As a result, the average CC is improved from 84.62% to 86.57%. These three novel techniques improve the accuracy of evaluating pronunciation quality.
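
The abstract does not say which correlation coefficient is used or how the average is taken. As a sketch under the assumption that it is the Pearson coefficient averaged over several expert raters, the metric could be computed as follows; `pearson_cc` and `average_cc` are hypothetical names, and the scores are made-up illustration values.

```python
import math


def pearson_cc(machine, expert):
    """Pearson correlation between two equal-length score lists."""
    if len(machine) != len(expert) or len(machine) < 2:
        raise ValueError("need two score lists of equal length >= 2")
    n = len(machine)
    mean_m = sum(machine) / n
    mean_e = sum(expert) / n
    cov = sum((m - mean_m) * (e - mean_e) for m, e in zip(machine, expert))
    var_m = sum((m - mean_m) ** 2 for m in machine)
    var_e = sum((e - mean_e) ** 2 for e in expert)
    if var_m == 0 or var_e == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_m * var_e)


def average_cc(machine, experts):
    """Mean Pearson CC of the machine scores against each expert (assumed averaging)."""
    return sum(pearson_cc(machine, e) for e in experts) / len(experts)


# Illustration only: one machine score list against two expert raters.
machine_scores = [1.0, 2.0, 3.0, 4.0]
expert_scores = [[2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 4.0]]
print(average_cc(machine_scores, expert_scores))
```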

  • Robust Speech Recognition by Combining Short-Term and Long-Term Spectrum Based Position-Dependent CMN with Conventional CMN

    Longbiao WANG  Seiichi NAKAGAWA  Norihide KITAOKA  

     
    PAPER-ASR under Reverberant Conditions

    Vol: E91-D  No: 3  Page(s): 457-466

    In a distant-talking environment, the length of the channel impulse response is longer than the short-term spectral analysis window. Conventional short-term spectrum based Cepstral Mean Normalization (CMN) is therefore not effective under these conditions. In this paper, we propose a robust speech recognition method that combines a short-term spectrum based CMN with a long-term one. We assume that a static speech segment (such as a vowel) affected by reverberation can be modeled by a long-term cepstral analysis. Thus, the effect of long reverberation on a static speech segment may be compensated by the long-term spectrum based CMN. The cepstral distance of neighboring frames is used to discriminate between the static speech segment (long-term spectrum) and the non-static speech segment (short-term spectrum). The cepstra of the static and non-static speech segments are normalized by the corresponding cepstral means. In a previous study, we proposed an environmentally robust speech recognition method based on Position-Dependent CMN (PDCMN) to compensate for channel distortion depending on speaker position, which is more efficient than conventional CMN. In this paper, the concept of combining short-term and long-term spectrum based CMN is extended to PDCMN. We call this Variable Term spectrum based PDCMN (VT-PDCMN). Because a position-dependent cepstral mean contains the average speaker characteristics over all speakers, PDCMN/VT-PDCMN cannot normalize speaker variations; we therefore also combine PDCMN/VT-PDCMN with conventional CMN in this study. We evaluated the proposed method on limited-vocabulary (100 words) distant-talking isolated word recognition in a real environment. The proposed method achieved a relative error reduction rate of 60.9% over the conventional short-term spectrum based CMN and 30.6% over the short-term spectrum based PDCMN.
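
The idea of normalizing static and non-static frames by separate cepstral means can be sketched as below. This is a simplified illustration, not the authors' VT-PDCMN: it reuses one set of cepstra for both groups (the paper uses a long-term analysis for static segments), it has no position dependence, and the Euclidean distance, the `threshold` parameter, and all function names are assumptions.

```python
import math


def cepstral_mean(frames):
    """Per-coefficient mean over a list of cepstral frames."""
    dim = len(frames[0])
    return [sum(f[k] for f in frames) / len(frames) for k in range(dim)]


def cmn(frames):
    """Conventional CMN: subtract the utterance-level cepstral mean."""
    mean = cepstral_mean(frames)
    return [[c - m for c, m in zip(f, mean)] for f in frames]


def is_static(frames, threshold):
    """Mark a frame static if it is close to its neighboring frame.

    Each frame is compared with the previous one; the first frame is
    compared with the next one (assumed convention).
    """
    labels = []
    for i in range(len(frames)):
        j = i - 1 if i > 0 else min(1, len(frames) - 1)
        labels.append(math.dist(frames[i], frames[j]) < threshold)
    return labels


def segmentwise_cmn(frames, threshold):
    """Normalize static and non-static frames by their own cepstral means."""
    labels = is_static(frames, threshold)
    out = [None] * len(frames)
    for flag in (True, False):
        idx = [i for i, s in enumerate(labels) if s == flag]
        if not idx:
            continue  # no frames of this kind in the utterance
        mean = cepstral_mean([frames[i] for i in idx])
        for i in idx:
            out[i] = [c - m for c, m in zip(frames[i], mean)]
    return out


# Illustration only: two steady frames followed by two changing frames.
frames = [[1.0, 2.0], [1.0, 2.0], [5.0, 6.0], [9.0, 2.0]]
print(cmn(frames))                   # [[-3.0, -1.0], [-3.0, -1.0], [1.0, 3.0], [5.0, -1.0]]
print(segmentwise_cmn(frames, 1.0))  # [[0.0, 0.0], [0.0, 0.0], [-2.0, 2.0], [2.0, -2.0]]
```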