IEICE TRANSACTIONS on Fundamentals

Low-Complexity and Accurate Noise Suppression Based on an a Priori SNR Model for Robust Speech Recognition on Embedded Systems and Its Evaluation in a Car Environment

Masanori TSUJIKAWA, Yoshinobu KAJIKAWA

Summary:

In this paper, we propose a low-complexity and accurate noise suppression method based on an a priori SNR (Speech to Noise Ratio) model for greater robustness with respect to short-term noise fluctuation. The a priori SNR, which is the ratio of speech spectra to noise spectra in the spectral domain, corresponds to the difference between speech features and noise features in the feature domain, including the mel-cepstral domain and the logarithmic power spectral domain. This is because logarithmic operations are used for the domain conversions. Therefore, an a priori SNR model can easily be expressed in terms of the difference between the speech model and the noise model, both of which are Gaussian mixture models, and it can be generated with low computational cost. By using a priori SNRs accurately estimated on the basis of an a priori SNR model, it is possible to calculate accurate coefficients of noise suppression filters that take into account the variance of noise, without a serious increase in computational cost over that of a conventional model-based Wiener filter (MBW). We conducted an in-car speech recognition evaluation using the CENSREC-2 database. A comparison of the proposed method with a conventional MBW showed that the recognition error rate over all noise environments was reduced by 9% and, notably, that the rate for audio-noise environments was reduced by 11%. Through implementation on a digital signal processor, we also show that the proposed method runs with low levels of computational and memory resources.
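
The relationship described in the summary, where a ratio in the spectral domain becomes a difference in a logarithmic feature domain, can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes a single Gaussian component per model rather than a full mixture, a single frequency bin in the natural-log power domain, and statistical independence between speech and noise, so that the variances add. The function names `snr_component` and `wiener_gain` are hypothetical and chosen only for illustration.

```python
import math

def snr_component(speech, noise):
    """Combine one speech Gaussian and one noise Gaussian, each given as
    (mean, variance) in the log-power domain, into a Gaussian over the
    log a priori SNR.

    Assumption (not from the paper): speech and noise are independent,
    so the difference has mean mu_s - mu_n and variance var_s + var_n.
    """
    mu_s, var_s = speech
    mu_n, var_n = noise
    return (mu_s - mu_n, var_s + var_n)

def wiener_gain(log_snr):
    """Standard Wiener gain xi / (1 + xi), written for a natural-log SNR."""
    return 1.0 / (1.0 + math.exp(-log_snr))

# Hypothetical values: speech power 4x the noise power in this bin.
speech = (math.log(4.0), 0.5)
noise = (math.log(1.0), 0.2)

mu_snr, var_snr = snr_component(speech, noise)
gain = wiener_gain(mu_snr)
print(mu_snr, var_snr, gain)
```

Because the SNR component is obtained by one subtraction and one addition per parameter, building it from existing speech and noise models is cheap. The noise variance carries through into `var_snr`, which is one way a filter could account for noise fluctuation under the assumptions stated above.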

Publication
IEICE TRANSACTIONS on Fundamentals Vol.E106-A No.9 pp.1224-1233
Publication Date
2023/09/01
Publicized
2023/02/28
Online ISSN
1745-1337
DOI
10.1587/transfun.2022EAP1130
Type of Manuscript
PAPER
Category
Digital Signal Processing

Authors

Masanori TSUJIKAWA
  Kansai University, NEC Corporation
Yoshinobu KAJIKAWA
  Kansai University

Keyword