Noise Robust Speech Recognition Using Subband-Crosscorrelation Analysis

Shoji KAJITA; Kazuya TAKEDA; Fumitada ITAKURA

Summary:

This paper describes subband-crosscorrelation analysis (SBXCOR) using two input channel signals. SBXCOR is a signal processing technique that extends subband-autocorrelation analysis (SBCOR) and extracts periodicities associated with the inverse of center frequencies present in speech signals. In addition, to extract more periodicity information associated with the inverse of center frequencies, multi-delay weighting (MDW) processing is applied to SBXCOR. In experiments, the noise robustness of SBXCOR is evaluated using a DTW word recognizer under (1) a simulated acoustic condition with white noise and (2) a real acoustic condition in a soundproof room with human speech-like noise. The results show that, under the simulated acoustic condition, SBXCOR is more robust than the conventional one-channel SBCOR but less robust than SBCOR extracted from the two-channel-summed signal. Furthermore, applying MDW processing improved the performance of SBXCOR by about 2% at an SNR of 0 dB. The resulting performance of SBXCOR with MDW processing was much better than that of the smoothed group delay spectrum (SGDS) and mel-filterbank cepstral coefficients (MFCC) below an SNR of 10 dB. The results under the real acoustic condition were almost the same as those under the simulated acoustic condition.
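
The core idea in the summary, correlating the two channels within each subband at a delay equal to the inverse of the subband's center frequency, can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not specify the filterbank, the normalization, or the framing, so the second-order band-pass filter, the Q value, the normalized correlation, and all function names below (`bandpass_biquad`, `normalized_xcorr`, `sbxcor`) are assumptions made for illustration. MDW processing is not covered.

```python
import math


def bandpass_biquad(x, fs, fc, q=4.0):
    """Second-order band-pass filter (standard biquad, 0 dB peak gain).

    Assumption: the paper's actual filterbank is not given in the summary;
    this simple filter only stands in for one subband channel.
    """
    w0 = 2.0 * math.pi * fc / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b1, b2 = alpha, 0.0, -alpha
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = (b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def normalized_xcorr(a, b, lag):
    """Normalized cross-correlation between a[n] and b[n + lag]."""
    n = min(len(a), len(b)) - lag
    num = sum(a[i] * b[i + lag] for i in range(n))
    energy_a = sum(v * v for v in a[:n])
    energy_b = sum(v * v for v in b[lag:lag + n])
    den = math.sqrt(energy_a * energy_b)
    return num / den if den > 0.0 else 0.0


def sbxcor(left, right, fs, center_freqs, q=4.0):
    """One cross-correlation value per subband, taken at a delay of 1/fc.

    Assumption: the delay is rounded to the nearest whole sample.
    """
    feats = []
    for fc in center_freqs:
        sub_left = bandpass_biquad(left, fs, fc, q)
        sub_right = bandpass_biquad(right, fs, fc, q)
        lag = int(round(fs / fc))  # one period of the center frequency
        feats.append(normalized_xcorr(sub_left, sub_right, lag))
    return feats


if __name__ == "__main__":
    fs = 8000
    # Hypothetical test signal: the same 500 Hz tone on both channels.
    tone = [math.sin(2.0 * math.pi * 500.0 * n / fs) for n in range(800)]
    print(sbxcor(tone, tone, fs, [250.0, 500.0, 1000.0]))
```

Feeding the same signal to both inputs reduces the sketch to a one-channel subband autocorrelation, which mirrors the relationship between SBXCOR and SBCOR described in the summary.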

Publication: IEICE TRANSACTIONS on Information Vol.E81-D No.10 pp.1079-1086

Publication Date: 1998/10/25

Category: Speech Processing and Acoustics

Shoji KAJITA, Kazuya TAKEDA, Fumitada ITAKURA, "Noise Robust Speech Recognition Using Subband-Crosscorrelation Analysis" in IEICE TRANSACTIONS on Information, vol. E81-D, no. 10, pp. 1079-1086, October 1998.
URL: https://global.ieice.org/en_transactions/information/10.1587/e81-d_10_1079/_p

@ARTICLE{e81-d_10_1079,
author={Shoji KAJITA and Kazuya TAKEDA and Fumitada ITAKURA},
journal={IEICE TRANSACTIONS on Information},
title={Noise Robust Speech Recognition Using Subband-Crosscorrelation Analysis},
year={1998},
volume={E81-D},
number={10},
pages={1079-1086},
abstract={This paper describes subband-crosscorrelation analysis (SBXCOR) using two input channel signals. SBXCOR is a signal processing technique that extends subband-autocorrelation analysis (SBCOR) and extracts periodicities associated with the inverse of center frequencies present in speech signals. In addition, to extract more periodicity information associated with the inverse of center frequencies, multi-delay weighting (MDW) processing is applied to SBXCOR. In experiments, the noise robustness of SBXCOR is evaluated using a DTW word recognizer under (1) a simulated acoustic condition with white noise and (2) a real acoustic condition in a soundproof room with human speech-like noise. The results show that, under the simulated acoustic condition, SBXCOR is more robust than the conventional one-channel SBCOR but less robust than SBCOR extracted from the two-channel-summed signal. Furthermore, applying MDW processing improved the performance of SBXCOR by about 2% at an SNR of 0 dB. The resulting performance of SBXCOR with MDW processing was much better than that of the smoothed group delay spectrum (SGDS) and mel-filterbank cepstral coefficients (MFCC) below an SNR of 10 dB. The results under the real acoustic condition were almost the same as those under the simulated acoustic condition.},
month={October},}

TY - JOUR
TI - Noise Robust Speech Recognition Using Subband-Crosscorrelation Analysis
T2 - IEICE TRANSACTIONS on Information
SP - 1079
EP - 1086
AU - Shoji KAJITA
AU - Kazuya TAKEDA
AU - Fumitada ITAKURA
PY - 1998
JO - IEICE TRANSACTIONS on Information
VL - E81-D
IS - 10
JA - IEICE TRANSACTIONS on Information
Y1 - October 1998
AB - This paper describes subband-crosscorrelation analysis (SBXCOR) using two input channel signals. SBXCOR is a signal processing technique that extends subband-autocorrelation analysis (SBCOR) and extracts periodicities associated with the inverse of center frequencies present in speech signals. In addition, to extract more periodicity information associated with the inverse of center frequencies, multi-delay weighting (MDW) processing is applied to SBXCOR. In experiments, the noise robustness of SBXCOR is evaluated using a DTW word recognizer under (1) a simulated acoustic condition with white noise and (2) a real acoustic condition in a soundproof room with human speech-like noise. The results show that, under the simulated acoustic condition, SBXCOR is more robust than the conventional one-channel SBCOR but less robust than SBCOR extracted from the two-channel-summed signal. Furthermore, applying MDW processing improved the performance of SBXCOR by about 2% at an SNR of 0 dB. The resulting performance of SBXCOR with MDW processing was much better than that of the smoothed group delay spectrum (SGDS) and mel-filterbank cepstral coefficients (MFCC) below an SNR of 10 dB. The results under the real acoustic condition were almost the same as those under the simulated acoustic condition.
ER -