The search functionality is under construction.
The search functionality is under construction.

High Quality Speech Synthesis Based on the Reproduction of the Randomness in Speech Signals

Naofumi AOKI

  • Full Text Views

    0

  • Cite this

Summary :

A high quality speech synthesis technique based on the wavelet subband analysis of speech signals was newly devised for enhancing the naturalness of synthesized voiced consonant speech. The technique reproduces a speech characteristic of voiced consonant speech that shows unvoiced feature remarkably in the high frequency subbands. For mixing appropriately the unvoiced feature into voiced speech, a noise inclusion procedure that employed the discrete wavelet transform was proposed. This paper also describes a developed speech synthesizer that employs several random fractal techniques. These techniques were employed for enhancing especially the naturalness of synthesized purely voiced speech. Three types of fluctuations, (1) pitch period fluctuation, (2) amplitude fluctuation, and (3) waveform fluctuation were treated in the speech synthesizer. In addition, instead of a normal impulse train, a triangular pulse was used as a simple model for the glottal excitation pulse. For the compensation for the degraded frequency characteristic of the triangular pulse that overdecreases than the spectral -6 dB/oct characteristic required for the glottal excitation pulse, the random fractal interpolation technique was applied. In order to evaluate the developed speech synthesis system, psychoacoustic experiments were carried out. The experiments especially focused on how the mixed excitation scheme effectively contributed to enhancing the naturalness of voiced consonant speech. In spite that the proposed techniques were just a little modification for enhancing the conventional LPC (linear predictive coding) speech synthesizer, the subjective evaluation suggested that the system could effectively gain the naturalness of the synthesized speech that tended to degrade in the conventional LPC speech synthesis scheme.

Publication
IEICE TRANSACTIONS on Fundamentals Vol.E84-A No.9 pp.2198-2206
Publication Date
2001/09/01
Publicized
Online ISSN
DOI
Type of Manuscript
Special Section PAPER (Special Section on Nonlinear Theory and its Applications)
Category
Image & Signal Processing

Authors

Keyword