The bandwidth occupied by individual telecommunication devices in the field of mobile radio communication must be narrow in order to effectively exploit the limited frequency band. Therefore, it is necessary to implement low-bit-rate speech coding that is robust against background noise. We examine vector quantization using a neural network (NNVQ) as a robust LSP encoder. In this paper, we compare four types of binary patterns of a hidden layer, and clarify the dependency of quantization distortion on the bit pattern. By delayed decision (selection of low-distortion codes in decoding, i.e., EbD method) the spectral distortion (SD) can be decreased by 0.8 dB (20%). For noisy speech, the performance of the EbD method is better than that of the conventional VQ codebook mapping method. In addition, the SD can be decreased by 2.3 dB (40%) by using a method in which the neural networks for encoding and decoding are combined and re-trained. Finally, we examine the SD for speech having different signal-to-noise ratios (SNRs) from that used in training. The experimental results show that training using SNR between 30 and 40 dB is appropriate.
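The abstract reports its results as spectral distortion (SD) in dB but does not define the measure. As a minimal sketch, assuming the common root-mean-square log-spectral distance between two power spectra (the paper may use a different variant), SD and the relative reduction quoted as a percentage could be computed as follows. The function names `spectral_distortion_db` and `relative_reduction` are hypothetical and chosen here for illustration only.

```python
import math


def spectral_distortion_db(power_ref, power_test):
    """RMS log-spectral distance in dB between two power spectra.

    Assumption: both inputs are sequences of strictly positive power
    values sampled at the same frequency bins. This is a common SD
    definition, not necessarily the one used in the paper.
    """
    if len(power_ref) != len(power_test) or len(power_ref) == 0:
        raise ValueError("spectra must be non-empty and of equal length")
    squared_diffs = [
        (10.0 * math.log10(p) - 10.0 * math.log10(q)) ** 2
        for p, q in zip(power_ref, power_test)
    ]
    return math.sqrt(sum(squared_diffs) / len(squared_diffs))


def relative_reduction(baseline_db, improved_db):
    """Fraction by which SD falls relative to the baseline value."""
    return (baseline_db - improved_db) / baseline_db
```

Under this definition, identical spectra give an SD of 0 dB, and a test spectrum that is uniformly ten times the reference power gives 10 dB.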
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Yoshinori MORITA, Tetsuo FUNADA, Hideyuki NOMURA, "Dependency of Distortion on Output Binary Pattern of the Hidden Layer for a Noisy LSP Quantization Neural Network" in IEICE TRANSACTIONS on Information,
vol. E87-D, no. 10, pp. 2348-2355, October 2004.
URL: https://global.ieice.org/en_transactions/information/10.1587/e87-d_10_2348/_p
@ARTICLE{e87-d_10_2348,
author={Yoshinori MORITA and Tetsuo FUNADA and Hideyuki NOMURA},
journal={IEICE TRANSACTIONS on Information},
title={Dependency of Distortion on Output Binary Pattern of the Hidden Layer for a Noisy LSP Quantization Neural Network},
year={2004},
volume={E87-D},
number={10},
pages={2348-2355},
abstract={The bandwidth occupied by individual telecommunication devices in the field of mobile radio communication must be narrow in order to effectively exploit the limited frequency band. Therefore, it is necessary to implement low-bit-rate speech coding that is robust against background noise. We examine vector quantization using a neural network (NNVQ) as a robust LSP encoder. In this paper, we compare four types of binary patterns of a hidden layer, and clarify the dependency of quantization distortion on the bit pattern. By delayed decision (selection of low-distortion codes in decoding, i.e., EbD method) the spectral distortion (SD) can be decreased by 0.8 dB (20%). For noisy speech, the performance of the EbD method is better than that of the conventional VQ codebook mapping method. In addition, the SD can be decreased by 2.3 dB (40%) by using a method in which the neural networks for encoding and decoding are combined and re-trained. Finally, we examine the SD for speech having different signal-to-noise ratios (SNRs) from that used in training. The experimental results show that training using SNR between 30 and 40 dB is appropriate.},
month={October},
}
TY - JOUR
TI - Dependency of Distortion on Output Binary Pattern of the Hidden Layer for a Noisy LSP Quantization Neural Network
T2 - IEICE TRANSACTIONS on Information
SP - 2348
EP - 2355
AU - Yoshinori MORITA
AU - Tetsuo FUNADA
AU - Hideyuki NOMURA
PY - 2004
JO - IEICE TRANSACTIONS on Information
VL - E87-D
IS - 10
JA - IEICE TRANSACTIONS on Information
Y1 - October 2004
AB - The bandwidth occupied by individual telecommunication devices in the field of mobile radio communication must be narrow in order to effectively exploit the limited frequency band. Therefore, it is necessary to implement low-bit-rate speech coding that is robust against background noise. We examine vector quantization using a neural network (NNVQ) as a robust LSP encoder. In this paper, we compare four types of binary patterns of a hidden layer, and clarify the dependency of quantization distortion on the bit pattern. By delayed decision (selection of low-distortion codes in decoding, i.e., EbD method) the spectral distortion (SD) can be decreased by 0.8 dB (20%). For noisy speech, the performance of the EbD method is better than that of the conventional VQ codebook mapping method. In addition, the SD can be decreased by 2.3 dB (40%) by using a method in which the neural networks for encoding and decoding are combined and re-trained. Finally, we examine the SD for speech having different signal-to-noise ratios (SNRs) from that used in training. The experimental results show that training using SNR between 30 and 40 dB is appropriate.
ER -