This paper proposes a new speaker adaptation method that applies a speaker weighting technique to multiple reference speaker training of a hidden Markov model (HMM). The proposed method considers the similarities between an input speaker and multiple reference speakers, and uses these similarities to control the influence of each reference speaker on the HMM. Evaluation experiments were carried out on a /b, d, g, m, n, N/ phoneme recognition task using 8 speakers. Average recognition rates were 68.0%, 66.4%, and 65.6%, respectively, for three test sets with different speech styles. These rates were 4.8%, 8.8%, and 10.5% higher than those of the spectrum mapping method, and 1.6%, 6.7%, and 8.2% higher than those of multiple reference speaker training (the supplemented HMM). The experiments demonstrated the effectiveness of the proposed method.
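The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea of letting speaker similarity control how much each reference speaker contributes to an HMM parameter estimate. Everything specific here is an assumption made for illustration: the softmax mapping from distance to weight, the `temperature` parameter, the use of discrete emission counts, and the function names `speaker_weights` and `weighted_emission_probs`. None of these are taken from the paper.

```python
import math


def speaker_weights(distances, temperature=1.0):
    """Map each reference speaker's distance to the input speaker onto a
    normalized weight (smaller distance -> larger weight).

    The softmax form is an illustrative assumption, not the paper's formula.
    """
    scores = [-d / temperature for d in distances]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_emission_probs(counts_per_speaker, weights):
    """Pool per-speaker emission counts for one HMM state, scaling each
    reference speaker's counts by its weight, then renormalize so the
    result is a probability distribution over symbols."""
    n_symbols = len(counts_per_speaker[0])
    pooled = [0.0] * n_symbols
    for counts, w in zip(counts_per_speaker, weights):
        for k, c in enumerate(counts):
            pooled[k] += w * c
    total = sum(pooled)
    return [p / total for p in pooled]


# Hypothetical data: three reference speakers, one state, four symbols.
distances = [0.5, 1.0, 3.0]
counts = [[8, 1, 1, 0], [2, 6, 1, 1], [0, 1, 2, 7]]

weights = speaker_weights(distances)
probs = weighted_emission_probs(counts, weights)
print(weights)
print(probs)
```

With equal distances the weights become uniform and the estimate reduces to plain pooling over all reference speakers, which corresponds to unweighted multiple reference speaker training in this simplified setting.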
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Hiroaki HATTORI, Satoshi NAKAMURA, Kiyohiro SHIKANO, Shigeki SAGAYAMA, "Speaker Weighted Training of HMM Using Multiple Reference Speakers" in IEICE TRANSACTIONS on Information,
vol. E76-D, no. 2, pp. 219-226, February 1993, doi: .
URL: https://global.ieice.org/en_transactions/information/10.1587/e76-d_2_219/_p
@ARTICLE{e76-d_2_219,
author={Hiroaki HATTORI and Satoshi NAKAMURA and Kiyohiro SHIKANO and Shigeki SAGAYAMA},
journal={IEICE TRANSACTIONS on Information},
title={Speaker Weighted Training of HMM Using Multiple Reference Speakers},
year={1993},
volume={E76-D},
number={2},
pages={219-226},
abstract={This paper proposes a new speaker adaptation method that applies a speaker weighting technique to multiple reference speaker training of a hidden Markov model (HMM). The proposed method considers the similarities between an input speaker and multiple reference speakers, and uses these similarities to control the influence of each reference speaker on the HMM. Evaluation experiments were carried out on a /b, d, g, m, n, N/ phoneme recognition task using 8 speakers. Average recognition rates were 68.0%, 66.4%, and 65.6%, respectively, for three test sets with different speech styles. These rates were 4.8%, 8.8%, and 10.5% higher than those of the spectrum mapping method, and 1.6%, 6.7%, and 8.2% higher than those of multiple reference speaker training (the supplemented HMM). The experiments demonstrated the effectiveness of the proposed method.},
keywords={},
doi={},
ISSN={},
month={February},}
TY - JOUR
TI - Speaker Weighted Training of HMM Using Multiple Reference Speakers
T2 - IEICE TRANSACTIONS on Information
SP - 219
EP - 226
AU - Hiroaki HATTORI
AU - Satoshi NAKAMURA
AU - Kiyohiro SHIKANO
AU - Shigeki SAGAYAMA
PY - 1993
DO -
JO - IEICE TRANSACTIONS on Information
SN -
VL - E76-D
IS - 2
JA - IEICE TRANSACTIONS on Information
Y1 - February 1993
AB - This paper proposes a new speaker adaptation method that applies a speaker weighting technique to multiple reference speaker training of a hidden Markov model (HMM). The proposed method considers the similarities between an input speaker and multiple reference speakers, and uses these similarities to control the influence of each reference speaker on the HMM. Evaluation experiments were carried out on a /b, d, g, m, n, N/ phoneme recognition task using 8 speakers. Average recognition rates were 68.0%, 66.4%, and 65.6%, respectively, for three test sets with different speech styles. These rates were 4.8%, 8.8%, and 10.5% higher than those of the spectrum mapping method, and 1.6%, 6.7%, and 8.2% higher than those of multiple reference speaker training (the supplemented HMM). The experiments demonstrated the effectiveness of the proposed method.
ER -