
Author Search Result

[Author] Shoji KAJITA (3 hits)

  • Speech Enhancement Using Nonlinear Microphone Array Based on Complementary Beamforming

    Hiroshi SARUWATARI  Shoji KAJITA  Kazuya TAKEDA  Fumitada ITAKURA  

     
    PAPER

    Vol: E82-A  No: 8  Page(s): 1501-1510

    This paper describes a spatial spectral subtraction method that uses a complementary beamforming microphone array to enhance noisy speech signals for speech recognition. Complementary beamforming is based on two types of beamformers designed to obtain directivity patterns that are complementary to each other. The paper shows that nonlinear subtraction processing with complementary beamforming yields a kind of spectral subtraction that does not require speech pause detection. An optimization algorithm for the directivity pattern is also described. To evaluate effectiveness, speech enhancement and speech recognition experiments are performed using computer simulations under both stationary and nonstationary noise conditions. In comparison with the optimized conventional delay-and-sum (DS) array, it is shown that (1) the proposed array improves the signal-to-noise ratio (SNR) of degraded speech by about 2 dB and performs more than 20% better in word recognition rate when white Gaussian noise with an input SNR of -5 or -10 dB is used, and (2) the proposed array performs more than 5% better in word recognition rate under the nonstationary noise conditions. These improvements are also shown to be the same as or superior to those of the conventional spectral subtraction method cascaded with the DS array.
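The delay-and-sum (DS) array that the abstract above uses as its baseline can be illustrated with a minimal sketch. This is not the proposed complementary beamformer. The integer sample delays, the test tone, the noise level, and all function names (`delay_and_sum`, `power`, `snr_db`) are assumptions chosen for illustration only.

```python
import math
import random

def delay_and_sum(channels, delays):
    """Average the channels after advancing each by its integer arrival delay."""
    n = len(channels[0])
    out = []
    for t in range(n):
        acc = 0.0
        for x, d in zip(channels, delays):
            idx = t + d  # undo the propagation delay for this microphone
            if 0 <= idx < n:
                acc += x[idx]
        out.append(acc / len(channels))
    return out

def power(x):
    """Mean squared value of a sequence."""
    return sum(v * v for v in x) / len(x)

def snr_db(signal, noise):
    """Signal-to-noise ratio in dB from separate signal and noise sequences."""
    return 10.0 * math.log10(power(signal) / power(noise))

random.seed(0)
n, delays = 4000, [0, 2, 4, 6]  # hypothetical four-microphone geometry
# Target: a sinusoid that reaches microphone m delayed by delays[m] samples.
target = [[math.sin(2 * math.pi * 0.05 * (t - d)) for t in range(n)] for d in delays]
# Noise: independent white Gaussian noise on every microphone.
noise = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in delays]

# The beamformer is linear, so target and noise can be processed separately.
snr_in = snr_db(target[0], noise[0])
snr_out = snr_db(delay_and_sum(target, delays), delay_and_sum(noise, delays))
gain_db = snr_out - snr_in
print(f"SNR gain of the DS array: {gain_db:.2f} dB")
```

With independent noise on M microphones, averaging the aligned channels preserves the target while reducing the noise power by roughly a factor of M, which is the gain this sketch measures.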

  • Noise Robust Speech Recognition Using Subband-Crosscorrelation Analysis

    Shoji KAJITA  Kazuya TAKEDA  Fumitada ITAKURA  

     
    PAPER-Speech Processing and Acoustics

    Vol: E81-D  No: 10  Page(s): 1079-1086

    This paper describes subband-crosscorrelation analysis (SBXCOR) using two input channel signals. SBXCOR is an extension of subband-autocorrelation analysis (SBCOR), a signal processing technique that extracts the periodicities present in speech signals that are associated with the inverse of the center frequencies. In addition, to extract more periodicity information associated with the inverse of the center frequencies, multi-delay weighting (MDW) processing is applied to SBXCOR. In experiments, the noise robustness of SBXCOR is evaluated using a DTW word recognizer under (1) a simulated acoustic condition with white noise and (2) a real acoustic condition in a soundproof room with human speech-like noise. The results under the simulated acoustic condition show that SBXCOR is more robust than the conventional one-channel SBCOR, but less robust than SBCOR extracted from the two-channel-summed signal. Furthermore, applying MDW processing improved the performance of SBXCOR by about 2% at an SNR of 0 dB. The resulting performance of SBXCOR with MDW processing was much better than that of the smoothed group delay spectrum (SGDS) and mel-filterbank cepstral coefficients (MFCC) at SNRs below 10 dB. The results under the real acoustic condition were almost the same as those under the simulated acoustic condition.
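One plausible reading of the SBXCOR idea in the abstract above is sketched below: each channel is band-pass filtered around a center frequency, and the two subband signals are cross-correlated at a lag equal to the inverse of that center frequency. The filter design (a Hann-windowed cosine), the normalization, the test signal, and all function names are assumptions, not details taken from the paper. MDW processing is not included.

```python
import math
import random

def bandpass(x, fc, fs, length=81):
    """FIR band-pass: Hann-windowed cosine centred on fc (an assumed design)."""
    mid = (length - 1) / 2
    h = [0.5 * (1 - math.cos(2 * math.pi * k / (length - 1)))
         * math.cos(2 * math.pi * fc * (k - mid) / fs)
         for k in range(length)]
    # Keep only samples where the filter fully overlaps the signal.
    return [sum(hk * x[t - k] for k, hk in enumerate(h))
            for t in range(length - 1, len(x))]

def crosscorr_at_lag(a, b, lag):
    """Normalised cross-correlation between a[t] and b[t + lag]."""
    a, b = a[:len(a) - lag], b[lag:]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den

def sbxcor(left, right, centre_freqs, fs):
    """One cross-correlation value per subband, taken at lag 1/fc."""
    feats = []
    for fc in centre_freqs:
        lag = round(fs / fc)  # inverse of the centre frequency, in samples
        feats.append(crosscorr_at_lag(bandpass(left, fc, fs),
                                      bandpass(right, fc, fs), lag))
    return feats

random.seed(1)
fs, n = 8000, 4000
# Both channels share a 200 Hz tone but carry independent white noise.
tone = [math.sin(2 * math.pi * 200 * t / fs) for t in range(n)]
left = [s + random.gauss(0.0, 0.3) for s in tone]
right = [s + random.gauss(0.0, 0.3) for s in tone]
features = sbxcor(left, right, [200, 500], fs)
print(features)
```

In this toy setup the 200 Hz subband, which contains the shared periodic component, gives a value near 1, while the 500 Hz subband contains only noise that is independent across the two channels and gives a value near 0.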

  • Speech Enhancement Using Nonlinear Microphone Array Based on Noise Adaptive Complementary Beamforming

    Hiroshi SARUWATARI  Shoji KAJITA  Kazuya TAKEDA  Fumitada ITAKURA  

     
    PAPER-Engineering Acoustics

    Vol: E83-A  No: 5  Page(s): 866-876

    This paper describes an improved complementary beamforming microphone array based on a new noise adaptation algorithm. Complementary beamforming is based on two types of beamformers designed to obtain directivity patterns that are complementary to each other. In this system, during a pause in the target speech, the two directivity patterns of the beamformers are adapted to the noise directions of arrival so that the expected value of each noise power spectrum is minimized in the array output. Using this technique, directional nulls can be realized for each noise source even when the number of sound sources exceeds the number of microphones. To evaluate effectiveness, speech enhancement and speech recognition experiments are performed using computer simulations with a two-element array and three sound sources under various noise conditions. In comparison with the conventional adaptive beamformer and the conventional spectral subtraction method cascaded with the adaptive beamformer, it is shown that (1) the proposed array improves the signal-to-noise ratio (SNR) of degraded speech by more than 6 dB when the interfering noise is two speakers with an input SNR below 0 dB, (2) the proposed array improves the SNR by about 2 dB when the interfering noise is bubble noise, and (3) the recognition rate improves by more than 18% when the interfering noise is two speakers or two overlapped signals of some speakers and the input SNR is 10 dB.
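To make the notion of a directional null in the abstract above concrete, the sketch below builds a conventional two-element linear beamformer that cancels a far-field plane wave from one chosen direction at a single frequency. It is not the noise-adaptive complementary method. The microphone spacing, frequency, angles, speed of sound, and function names are all assumed values for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def phase_lag(freq_hz, spacing_m, angle_deg):
    """Phase lag of microphone 2 relative to microphone 1 for a far-field plane wave."""
    return (2 * math.pi * freq_hz * spacing_m
            * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND)

def null_weights(freq_hz, spacing_m, null_deg):
    """Weights (w1, w2) whose output w1*x1 + w2*x2 cancels a wave from null_deg."""
    return (1.0, -cmath.exp(1j * phase_lag(freq_hz, spacing_m, null_deg)))

def response(weights, freq_hz, spacing_m, angle_deg):
    """Magnitude of the array output for a unit plane wave from angle_deg."""
    x2 = cmath.exp(-1j * phase_lag(freq_hz, spacing_m, angle_deg))
    return abs(weights[0] * 1.0 + weights[1] * x2)

# Hypothetical setup: 5 cm spacing, 1 kHz, null steered to 40 degrees.
w = null_weights(1000.0, 0.05, 40.0)
at_null = response(w, 1000.0, 0.05, 40.0)
at_front = response(w, 1000.0, 0.05, 0.0)
print(at_null, at_front)
```

A single linear beamformer on two microphones can place only one such null at each frequency, which is the limitation that, according to the abstract, the proposed adaptation of two complementary directivity patterns is meant to overcome.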