A speech analysis/synthesis system that relies on a time-varying Auto Regressive Moving Average (ARMA) process and the Short-Time Fourier Transform (STFT) is proposed. The narrowband components in speech are represented in the frequency domain by a set of harmonic components, while the broadband random components are represented by a time-varying ARMA process. The time-varying ARMA model has a dual function, namely, it creates a spectral envelope that fits accurately the harmonic STFT components, and provides for the spectral representation of the broadband components of speech. The proposed model essentially combines the features of waveform coders by employing the STFT and the features of traditional vocoders by incorporating an appropriately shaped noise sequence.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Andreas SPANIAS, Philipos LOIZOU, Gim LIM, Ye CHEN, Gen HU, "Analysis/Synthesis of Speech Using the Short-Time Fourier Transform and a Time-Varying ARMA Process" in IEICE TRANSACTIONS on Fundamentals,
vol. E76-A, no. 4, pp. 645-652, April 1993.
Abstract: A speech analysis/synthesis system that relies on a time-varying Auto Regressive Moving Average (ARMA) process and the Short-Time Fourier Transform (STFT) is proposed. The narrowband components in speech are represented in the frequency domain by a set of harmonic components, while the broadband random components are represented by a time-varying ARMA process. The time-varying ARMA model has a dual function, namely, it creates a spectral envelope that fits accurately the harmonic STFT components, and provides for the spectral representation of the broadband components of speech. The proposed model essentially combines the features of waveform coders by employing the STFT and the features of traditional vocoders by incorporating an appropriately shaped noise sequence.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/e76-a_4_645/_p
@ARTICLE{e76-a_4_645,
author={Andreas SPANIAS and Philipos LOIZOU and Gim LIM and Ye CHEN and Gen HU},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Analysis/Synthesis of Speech Using the Short-Time Fourier Transform and a Time-Varying ARMA Process},
year={1993},
volume={E76-A},
number={4},
pages={645--652},
abstract={A speech analysis/synthesis system that relies on a time-varying Auto Regressive Moving Average (ARMA) process and the Short-Time Fourier Transform (STFT) is proposed. The narrowband components in speech are represented in the frequency domain by a set of harmonic components, while the broadband random components are represented by a time-varying ARMA process. The time-varying ARMA model has a dual function, namely, it creates a spectral envelope that fits accurately the harmonic STFT components, and provides for the spectral representation of the broadband components of speech. The proposed model essentially combines the features of waveform coders by employing the STFT and the features of traditional vocoders by incorporating an appropriately shaped noise sequence.},
month={April},}
TY - JOUR
TI - Analysis/Synthesis of Speech Using the Short-Time Fourier Transform and a Time-Varying ARMA Process
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 645
EP - 652
AU - Andreas SPANIAS
AU - Philipos LOIZOU
AU - Gim LIM
AU - Ye CHEN
AU - Gen HU
PY - 1993
JO - IEICE TRANSACTIONS on Fundamentals
VL - E76-A
IS - 4
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - April 1993
AB - A speech analysis/synthesis system that relies on a time-varying Auto Regressive Moving Average (ARMA) process and the Short-Time Fourier Transform (STFT) is proposed. The narrowband components in speech are represented in the frequency domain by a set of harmonic components, while the broadband random components are represented by a time-varying ARMA process. The time-varying ARMA model has a dual function, namely, it creates a spectral envelope that fits accurately the harmonic STFT components, and provides for the spectral representation of the broadband components of speech. The proposed model essentially combines the features of waveform coders by employing the STFT and the features of traditional vocoders by incorporating an appropriately shaped noise sequence.
ER -