High Quality and Low Complexity Speech Analysis/Synthesis Based on Sinusoidal Representation

Jianguo TAN; Wenjun ZHANG; Peilin LIU

doi:10.1093/ietisy/e88-d.12.2893

IEICE TRANSACTIONS on Information


Summary :

Sinusoidal representation has been widely applied to speech modification and to low-bit-rate speech and audio coding. Speech signals are usually analyzed and synthesized using either the overlap-add algorithm or the peak-picking algorithm. However, the overlap-add algorithm is known for its high computational complexity, and the peak-picking algorithm cannot track transient and syllabic variations well. In this letter, both algorithms are applied to speech analysis/synthesis. Peaks are picked from the power spectral density curve of the speech signal, and the frequencies corresponding to these peaks are arranged in descending order of their power spectral densities. These frequencies serve as candidate frequencies, for which the corresponding amplitudes and initial phases are determined according to the least mean square error criterion. The sum of the extracted sinusoidal components successively approximates the original speech signal. The results show that the proposed algorithm can track transient and syllabic variations and yields good synthesized speech with low computational complexity.
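The steps named in the summary (peak picking on the power spectral density, ordering candidates by descending power, least-squares estimation of amplitude and initial phase, and successive approximation) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which the summary does not specify in detail. The function names `analyze_frame` and `synthesize`, the Hann-windowed periodogram as the PSD estimate, the zero-padded FFT size, the fixed number of sinusoids, and the greedy fit-and-subtract loop are all assumptions made for illustration.

```python
import numpy as np

def analyze_frame(x, fs, n_sines=10, nfft=4096):
    """Greedy sinusoidal fit of one frame (illustrative sketch, not the paper's code)."""
    n = np.arange(len(x))
    # Assumption: a Hann-windowed, zero-padded periodogram stands in for the PSD.
    psd = np.abs(np.fft.rfft(x * np.hanning(len(x)), nfft)) ** 2
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    # Pick local maxima of the PSD (interior bins only).
    peaks = np.flatnonzero((psd[1:-1] > psd[:-2]) & (psd[1:-1] >= psd[2:])) + 1
    # Order candidate frequencies by descending PSD and keep the strongest ones.
    peaks = peaks[np.argsort(psd[peaks])[::-1]][:n_sines]
    residual = np.asarray(x, dtype=float).copy()
    params = []
    for f in freqs[peaks]:
        # Least-squares fit of a*cos + b*sin at frequency f to the current residual.
        basis = np.column_stack([np.cos(2 * np.pi * f * n / fs),
                                 np.sin(2 * np.pi * f * n / fs)])
        (a, b), *_ = np.linalg.lstsq(basis, residual, rcond=None)
        # a*cos(wt) + b*sin(wt) == amp*cos(wt + phase)
        amp, phase = np.hypot(a, b), np.arctan2(-b, a)
        params.append((f, amp, phase))
        # Subtract the fitted component so the sum approaches the frame step by step.
        residual -= basis @ np.array([a, b])
    return params, residual

def synthesize(params, length, fs):
    """Rebuild a frame as the sum of the extracted sinusoids."""
    n = np.arange(length)
    return sum(amp * np.cos(2 * np.pi * f * n / fs + phase) for f, amp, phase in params)

# Usage: a synthetic two-tone frame (values chosen only for the demonstration).
fs = 8000
n = np.arange(256)
x = np.cos(2 * np.pi * 500 * n / fs + 0.3) + 0.5 * np.cos(2 * np.pi * 1500 * n / fs - 1.0)
params, residual = analyze_frame(x, fs, n_sines=2)
y = synthesize(params, len(x), fs)
```

In this sketch each component is fitted to the residual left by the previous ones, so the running sum approaches the input frame one sinusoid at a time. A real analyzer would also need framing, a stopping rule, and a treatment of frame boundaries, none of which the summary describes.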

Publication: IEICE TRANSACTIONS on Information Vol.E88-D No.12 pp.2893-2896

Publication Date: 2005/12/01

DOI: 10.1093/ietisy/e88-d.12.2893

Type of Manuscript: LETTER

Category: Speech and Hearing

Cite this


Jianguo TAN, Wenjun ZHANG, Peilin LIU, "High Quality and Low Complexity Speech Analysis/Synthesis Based on Sinusoidal Representation" in IEICE TRANSACTIONS on Information, vol. E88-D, no. 12, pp. 2893-2896, December 2005, doi: 10.1093/ietisy/e88-d.12.2893.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e88-d.12.2893/_p


@ARTICLE{e88-d_12_2893,
author={Jianguo TAN and Wenjun ZHANG and Peilin LIU},
journal={IEICE TRANSACTIONS on Information},
title={High Quality and Low Complexity Speech Analysis/Synthesis Based on Sinusoidal Representation},
year={2005},
volume={E88-D},
number={12},
pages={2893--2896},
doi={10.1093/ietisy/e88-d.12.2893},
month={December},
}


TY  - JOUR
TI  - High Quality and Low Complexity Speech Analysis/Synthesis Based on Sinusoidal Representation
T2  - IEICE TRANSACTIONS on Information
SP  - 2893
EP  - 2896
AU  - Jianguo TAN
AU  - Wenjun ZHANG
AU  - Peilin LIU
PY  - 2005
DO  - 10.1093/ietisy/e88-d.12.2893
JO  - IEICE TRANSACTIONS on Information
VL  - E88-D
IS  - 12
JA  - IEICE TRANSACTIONS on Information
Y1  - 2005/12/01
ER  - 
