Cepstral Amplitude Range Normalization for Noise Robust Speech Recognition

Shingo YOSHIZAWA; Noboru HAYASAKA; Naoya WADA; Yoshikazu MIYANAGA

IEICE TRANSACTIONS on Information

Summary:

This paper describes a noise-robustness technique that normalizes the cepstral amplitude range to remove the influence of additive noise. Additive noise causes a mismatch between speech features in the testing and training environments, which degrades recognition accuracy in noisy conditions. We assume an approximate model in which this influence appears as a change in the amplitude range and the DC component of the log-spectra. Based on this model, we propose cepstral amplitude range normalization (CARN), which normalizes the cepstral distance between the maximum and minimum values. It estimates noise-robust features without prior knowledge or adaptation. We evaluated its performance on an isolated word recognition task using the Noisex92 database. Compared with combinations of conventional methods, CARN improved recognition accuracy under various SNR conditions.
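
The summary does not give the CARN formula, so the following is only a minimal sketch of the general idea of range normalization, not the authors' method. It assumes that noise acts on each cepstral coefficient as a positive gain plus an offset over one utterance, removes the offset by subtracting the per-coefficient mean, and divides by the distance between the maximum and minimum values. The function names, the `eps` guard, and the toy data are all hypothetical.

```python
from typing import List


def normalize_amplitude_range(
    frames: List[List[float]], eps: float = 1e-8
) -> List[List[float]]:
    """Illustrative per-coefficient range normalization over one utterance.

    frames[t][k] is cepstral coefficient k at frame t. This is an assumed
    formulation, not the formula from the paper.
    """
    if not frames:
        return []
    num_coeffs = len(frames[0])
    normalized = [[0.0] * num_coeffs for _ in frames]
    for k in range(num_coeffs):
        track = [frame[k] for frame in frames]
        lo, hi = min(track), max(track)
        span = max(hi - lo, eps)          # guard against a constant track
        mean = sum(track) / len(track)    # stands in for the DC offset
        for t, value in enumerate(track):
            normalized[t][k] = (value - mean) / span
    return normalized


def scale_and_shift(
    frames: List[List[float]], gain: float, offset: float
) -> List[List[float]]:
    """Toy distortion: shrink the amplitude range and move the offset."""
    return [[gain * v + offset for v in frame] for frame in frames]


# Hypothetical data: three frames, two cepstral coefficients each.
clean = [[1.0, -2.0], [3.0, 0.5], [2.0, 4.0]]
distorted = scale_and_shift(clean, gain=0.4, offset=1.5)

clean_norm = normalize_amplitude_range(clean)
distorted_norm = normalize_amplitude_range(distorted)
```

Under this idealized gain-and-offset distortion, `clean_norm` and `distorted_norm` agree up to floating-point error, which is the sense in which a range-normalized feature is insensitive to the modeled change. Real additive noise only approximately follows such a model.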

Publication: IEICE TRANSACTIONS on Information Vol.E87-D No.8 pp.2130-2137

Publication Date: 2004/08/01

Type of Manuscript: PAPER

Category: Speech and Hearing

Shingo YOSHIZAWA, Noboru HAYASAKA, Naoya WADA, Yoshikazu MIYANAGA, "Cepstral Amplitude Range Normalization for Noise Robust Speech Recognition" in IEICE TRANSACTIONS on Information, vol. E87-D, no. 8, pp. 2130-2137, August 2004.
URL: https://global.ieice.org/en_transactions/information/10.1587/e87-d_8_2130/_p

@ARTICLE{e87-d_8_2130,
author={Shingo YOSHIZAWA and Noboru HAYASAKA and Naoya WADA and Yoshikazu MIYANAGA},
journal={IEICE TRANSACTIONS on Information},
title={Cepstral Amplitude Range Normalization for Noise Robust Speech Recognition},
year={2004},
volume={E87-D},
number={8},
pages={2130-2137},
abstract={This paper describes a noise robustness technique that normalizes the cepstral amplitude range in order to remove the influence of additive noise. Additive noise causes speech feature mismatches between testing and training environments and it degrades recognition accuracy in noisy environments. We presume an approximate model that expresses the influence by changing the amplitude range and the DC component in the log-spectra. According to this model, we propose a cepstral amplitude range normalization (CARN) that normalizes the cepstral distance between maximum and minimum values. It can estimate noise robust features without prior knowledge or adaptation. We evaluated its performance in an isolated word recognition task by using the Noisex92 database. Compared with the combinations of conventional methods, the CARN could improve recognition accuracy under various SNR conditions.},
month={August}
}

TY - JOUR
TI - Cepstral Amplitude Range Normalization for Noise Robust Speech Recognition
T2 - IEICE TRANSACTIONS on Information
SP - 2130
EP - 2137
AU - Shingo YOSHIZAWA
AU - Noboru HAYASAKA
AU - Naoya WADA
AU - Yoshikazu MIYANAGA
PY - 2004
DO -
JO - IEICE TRANSACTIONS on Information
SN -
VL - E87-D
IS - 8
JA - IEICE TRANSACTIONS on Information
Y1 - August 2004
AB - This paper describes a noise robustness technique that normalizes the cepstral amplitude range in order to remove the influence of additive noise. Additive noise causes speech feature mismatches between testing and training environments and it degrades recognition accuracy in noisy environments. We presume an approximate model that expresses the influence by changing the amplitude range and the DC component in the log-spectra. According to this model, we propose a cepstral amplitude range normalization (CARN) that normalizes the cepstral distance between maximum and minimum values. It can estimate noise robust features without prior knowledge or adaptation. We evaluated its performance in an isolated word recognition task by using the Noisex92 database. Compared with the combinations of conventional methods, the CARN could improve recognition accuracy under various SNR conditions.
ER -
