4 kbps Improved Pitch Prediction CELP Speech Coding with 20 msec Frame

Masahiro SERIZAWA; Kazunori OZAWA

IEICE TRANSACTIONS on Information

4 kbps Improved Pitch Prediction CELP Speech Coding with 20 msec Frame

Masahiro SERIZAWA, Kazunori OZAWA

Full Text Views

0

Cite this

Summary :

This paper proposes a new pitch prediction method for 4 kbps CELP (Code Excited LPC) speech coding with 20 msec frame, for the future ITU-T 4 kbps speech coding standardization. In the conventional CELP speech coding, synthetic speech quality deteriorates rapidly at 4 kbps, especially for female and children's speech with short pitch period. The pitch prediction performance is significantly degraded for such speech. The important reason is that when the pitch period is shorter than the subframe length, the simple repetition of the past excitation signal based on the estimated lag, not the pitch prediction, is usually carried out in the adaptive codebook operation. The proposed pitch prediction method can carry out the pitch prediction without the above approximation by utilizing the current subframe excitation codevector signal, when the pitch prediction parameters are determined. To further improve the performance, a split vector synthesis and perceptually spectral weighting method, and a low-complexity perceptually harmonic and spectral weighting method have also been developed. The informal listening test result shows that the 4 kbps speech coder with 20 msec frame, utilizing all of the proposed improvements, achieves 0.2 MOS higher results than the coder without them.

Publication: IEICE TRANSACTIONS on Information Vol.E78-D No.6 pp.758-763

Publication Date: 1995/06/25

Publicized

Online ISSN

DOI

Type of Manuscript: Special Section PAPER (Special Issue on Spoken Language Processing)

Category

Cite this

Copy

Masahiro SERIZAWA, Kazunori OZAWA, "4 kbps Improved Pitch Prediction CELP Speech Coding with 20 msec Frame" in IEICE TRANSACTIONS on Information, vol. E78-D, no. 6, pp. 758-763, June 1995, doi: .
Abstract: This paper proposes a new pitch prediction method for 4 kbps CELP (Code Excited LPC) speech coding with 20 msec frame, for the future ITU-T 4 kbps speech coding standardization. In the conventional CELP speech coding, synthetic speech quality deteriorates rapidly at 4 kbps, especially for female and children's speech with short pitch period. The pitch prediction performance is significantly degraded for such speech. The important reason is that when the pitch period is shorter than the subframe length, the simple repetition of the past excitation signal based on the estimated lag, not the pitch prediction, is usually carried out in the adaptive codebook operation. The proposed pitch prediction method can carry out the pitch prediction without the above approximation by utilizing the current subframe excitation codevector signal, when the pitch prediction parameters are determined. To further improve the performance, a split vector synthesis and perceptually spectral weighting method, and a low-complexity perceptually harmonic and spectral weighting method have also been developed. The informal listening test result shows that the 4 kbps speech coder with 20 msec frame, utilizing all of the proposed improvements, achieves 0.2 MOS higher results than the coder without them.
URL: https://global.ieice.org/en_transactions/information/10.1587/e78-d_6_758/_p

Copy

@ARTICLE{e78-d_6_758,
author={Masahiro SERIZAWA, Kazunori OZAWA, },
journal={IEICE TRANSACTIONS on Information},
title={4 kbps Improved Pitch Prediction CELP Speech Coding with 20 msec Frame},
year={1995},
volume={E78-D},
number={6},
pages={758-763},
abstract={This paper proposes a new pitch prediction method for 4 kbps CELP (Code Excited LPC) speech coding with 20 msec frame, for the future ITU-T 4 kbps speech coding standardization. In the conventional CELP speech coding, synthetic speech quality deteriorates rapidly at 4 kbps, especially for female and children's speech with short pitch period. The pitch prediction performance is significantly degraded for such speech. The important reason is that when the pitch period is shorter than the subframe length, the simple repetition of the past excitation signal based on the estimated lag, not the pitch prediction, is usually carried out in the adaptive codebook operation. The proposed pitch prediction method can carry out the pitch prediction without the above approximation by utilizing the current subframe excitation codevector signal, when the pitch prediction parameters are determined. To further improve the performance, a split vector synthesis and perceptually spectral weighting method, and a low-complexity perceptually harmonic and spectral weighting method have also been developed. The informal listening test result shows that the 4 kbps speech coder with 20 msec frame, utilizing all of the proposed improvements, achieves 0.2 MOS higher results than the coder without them.},
keywords={},
doi={},
ISSN={},
month={June},}

Copy

TY - JOUR
TI - 4 kbps Improved Pitch Prediction CELP Speech Coding with 20 msec Frame
T2 - IEICE TRANSACTIONS on Information
SP - 758
EP - 763
AU - Masahiro SERIZAWA
AU - Kazunori OZAWA
PY - 1995
DO -
JO - IEICE TRANSACTIONS on Information
SN -
VL - E78-D
IS - 6
JA - IEICE TRANSACTIONS on Information
Y1 - June 1995
AB - This paper proposes a new pitch prediction method for 4 kbps CELP (Code Excited LPC) speech coding with 20 msec frame, for the future ITU-T 4 kbps speech coding standardization. In the conventional CELP speech coding, synthetic speech quality deteriorates rapidly at 4 kbps, especially for female and children's speech with short pitch period. The pitch prediction performance is significantly degraded for such speech. The important reason is that when the pitch period is shorter than the subframe length, the simple repetition of the past excitation signal based on the estimated lag, not the pitch prediction, is usually carried out in the adaptive codebook operation. The proposed pitch prediction method can carry out the pitch prediction without the above approximation by utilizing the current subframe excitation codevector signal, when the pitch prediction parameters are determined. To further improve the performance, a split vector synthesis and perceptually spectral weighting method, and a low-complexity perceptually harmonic and spectral weighting method have also been developed. The informal listening test result shows that the 4 kbps speech coder with 20 msec frame, utilizing all of the proposed improvements, achieves 0.2 MOS higher results than the coder without them.
ER -

IEICE TRANSACTIONS on Information

4 kbps Improved Pitch Prediction CELP Speech Coding with 20 msec Frame

Summary :

Authors

Keyword

Latest Issue

Contents

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles

IEICE TRANSACTIONS on Information

4 kbps Improved Pitch Prediction CELP Speech Coding with 20 msec Frame

Summary :

Authors

Keyword

Latest Issue

Contents

Copyrights notice of machine-translated contents

Cite this

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles