Signal Processing Representations of Speech

W. Bastiaan KLEIJN

Summary:

Synergies in processing requirements and knowledge of human speech production and perception have led to a similarity of the speech signal representations used for the tasks of recognition, coding, and modification. The representations are generally composed of a description of the vocal-tract transfer function and, in the case of coding and modification, a description of the excitation signal. This paper provides an overview of commonly used representations. For coding and modification, autoregressive models represented by line spectral frequencies perform well for the vocal tract, and pitch-synchronous filter banks and modulation-domain filters perform well for the excitation. For recognition, good representations are based on a smoothed magnitude response of the vocal tract.

Publication: IEICE TRANSACTIONS on Information Vol.E86-D No.3 pp.359-376

Publication Date: 2003/03/01

Type of Manuscript: INVITED SURVEY PAPER

W. Bastiaan KLEIJN, "Signal Processing Representations of Speech" in IEICE TRANSACTIONS on Information, vol. E86-D, no. 3, pp. 359-376, March 2003.
URL: https://global.ieice.org/en_transactions/information/10.1587/e86-d_3_359/_p

@ARTICLE{e86-d_3_359,
author={W. Bastiaan KLEIJN},
journal={IEICE TRANSACTIONS on Information},
title={Signal Processing Representations of Speech},
year={2003},
volume={E86-D},
number={3},
pages={359--376},
abstract={Synergies in processing requirements and knowledge of human speech production and perception have led to a similarity of the speech signal representations used for the tasks of recognition, coding, and modification. The representations are generally composed of a description of the vocal-tract transfer function and, in the case of coding and modification, a description of the excitation signal. This paper provides an overview of commonly used representations. For coding and modification, autoregressive models represented by line spectral frequencies perform well for the vocal tract, and pitch-synchronous filter banks and modulation-domain filters perform well for the excitation. For recognition, good representations are based on a smoothed magnitude response of the vocal tract.},
month={March},
}

TY  - JOUR
TI  - Signal Processing Representations of Speech
T2  - IEICE TRANSACTIONS on Information
SP  - 359
EP  - 376
AU  - W. Bastiaan KLEIJN
PY  - 2003
JO  - IEICE TRANSACTIONS on Information
VL  - E86-D
IS  - 3
JA  - IEICE TRANSACTIONS on Information
Y1  - March 2003
AB  - Synergies in processing requirements and knowledge of human speech production and perception have led to a similarity of the speech signal representations used for the tasks of recognition, coding, and modification. The representations are generally composed of a description of the vocal-tract transfer function and, in the case of coding and modification, a description of the excitation signal. This paper provides an overview of commonly used representations. For coding and modification, autoregressive models represented by line spectral frequencies perform well for the vocal tract, and pitch-synchronous filter banks and modulation-domain filters perform well for the excitation. For recognition, good representations are based on a smoothed magnitude response of the vocal tract.
ER  - 