A speaker adaptation technique that maximizes the observation probability of input speech is proposed and applied to semi-continuous hidden Markov model (SCHMM) speech recognizers. The algorithm iteratively adapts the mean µ and the covariance Σ by gradient search so that the features of the adaptation speech data achieve maximum observation probability. The mixture coefficients and the state transition probabilities are adapted by a model interpolation scheme. The main advantage of this scheme is that the means and variances, which are shared by all states in an SCHMM, are adapted independently of the other SCHMM parameters. This allows fast and precise adaptation, especially when there is a large acoustic mismatch between the reference model and a new speaker. The scheme could also be applied to other areas that use a codebook. The algorithm was evaluated with a male speaker-dependent recognizer, a female speaker-dependent recognizer, and a speaker-independent recognizer. In isolated word recognition experiments, it achieved an average enhancement of 46.03% for the male speaker-dependent recognizer, 52.18% for the female speaker-dependent recognizer, and 9.84% for the speaker-independent recognizer.
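
The abstract gives no equations, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a shared codebook of 1-D Gaussians, plain gradient ascent on the log observation probability, an update in log-variance space to keep variances positive, and a linear interpolation for the mixture coefficients. The names `adapt_codebook` and `interpolate`, the step size, the iteration count, the interpolation weight `lam`, and all numeric values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def log_observation_probability(frames, weights, means, variances):
    """Sum over frames of log sum_k w_k * N(x; mean_k, var_k)."""
    total = 0.0
    for x in frames:
        total += math.log(sum(w * gaussian_pdf(x, m, v)
                              for w, m, v in zip(weights, means, variances)))
    return total


def adapt_codebook(frames, weights, means, variances, step=0.1, iterations=200):
    """Gradient ascent on the codebook means and variances only.

    The mixture weights stay fixed here. Variances are updated through
    their logarithm (an assumption of this sketch) so they remain positive.
    """
    means = list(means)
    log_vars = [math.log(v) for v in variances]
    n = len(frames)
    for _ in range(iterations):
        variances = [math.exp(s) for s in log_vars]
        grad_mean = [0.0] * len(means)
        grad_log_var = [0.0] * len(means)
        for x in frames:
            likes = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(likes)
            for k, like in enumerate(likes):
                gamma = like / total  # posterior weight of codeword k for this frame
                diff = x - means[k]
                grad_mean[k] += gamma * diff / variances[k]
                grad_log_var[k] += gamma * 0.5 * (diff * diff / variances[k] - 1.0)
        for k in range(len(means)):
            means[k] += step * grad_mean[k] / n
            log_vars[k] += step * grad_log_var[k] / n
    return means, [math.exp(s) for s in log_vars]


def interpolate(reference, speaker_estimate, lam):
    """Linear interpolation between reference and speaker-specific parameters."""
    return [lam * s + (1.0 - lam) * r for r, s in zip(reference, speaker_estimate)]


# Hypothetical reference model and adaptation data (illustrative values only).
reference_weights = [0.5, 0.5]
reference_means = [-1.0, 1.0]
reference_variances = [1.0, 1.0]
adaptation_frames = [0.0, 0.5, 1.0, 2.0, 2.5, 3.0]

adapted_means, adapted_variances = adapt_codebook(
    adaptation_frames, reference_weights, reference_means, reference_variances)

# Mixture coefficients are handled separately, by interpolation.
adapted_weights = interpolate(reference_weights, [0.4, 0.6], lam=0.5)
```

Keeping the codebook update separate from the weight interpolation mirrors the independence the abstract highlights: the shared means and variances can move toward the new speaker without touching the state-specific parameters.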

- Publication: IEICE TRANSACTIONS on Information Vol.E84-D No.2 pp.286-288
- Publication Date: 2001/02/01
- Type of Manuscript: LETTER
- Category: Speech and Hearing

The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.

Tae-Young YANG, Chungyong LEE, Dae-Hee YOUN, "Speaker Adaptation Based on a Maximum Observation Probability Criterion," in IEICE TRANSACTIONS on Information, vol. E84-D, no. 2, pp. 286-288, February 2001.

URL: https://global.ieice.org/en_transactions/information/10.1587/e84-d_2_286/_p

@ARTICLE{e84-d_2_286,
  author={Tae-Young YANG and Chungyong LEE and Dae-Hee YOUN},
  journal={IEICE TRANSACTIONS on Information},
  title={Speaker Adaptation Based on a Maximum Observation Probability Criterion},
  year={2001},
  volume={E84-D},
  number={2},
  pages={286--288},
  month={February}
}

TY  - JOUR
TI  - Speaker Adaptation Based on a Maximum Observation Probability Criterion
T2  - IEICE TRANSACTIONS on Information
SP  - 286
EP  - 288
AU  - Tae-Young YANG
AU  - Chungyong LEE
AU  - Dae-Hee YOUN
PY  - 2001
JO  - IEICE TRANSACTIONS on Information
VL  - E84-D
IS  - 2
JA  - IEICE TRANSACTIONS on Information
Y1  - February 2001
ER  -