Kouichi YAMAGUCHI, Harald SINGER, Shoichi MATSUNAGA, Shigeki SAGAYAMA, "Speaker-Consistent Parsing for Speaker-Independent Continuous Speech Recognition" in IEICE TRANSACTIONS on Information,
vol. E78-D, no. 6, pp. 719-724, June 1995.
Abstract: This paper describes a novel speaker-independent speech recognition method, called "speaker-consistent parsing", which is based on an intra-speaker correlation called the speaker-consistency principle. We focus on the fact that a sentence or a string of words is uttered by an individual speaker even in a speaker-independent task. Thus, the proposed method searches through speaker variations in addition to the contents of utterances. As a result of the recognition process, an appropriate standard speaker is selected for speaker adaptation. This new method is experimentally compared with a conventional speaker-independent speech recognition method. Since the speaker-consistency principle best demonstrates its effect with a large number of training and test speakers, a small-scale experiment may not fully exploit this principle. Nevertheless, even the results of our small-scale experiment show that the new method significantly outperforms the conventional method. In addition, this framework's speaker selection mechanism can drastically reduce the likelihood map computation.
URL: https://global.ieice.org/en_transactions/information/10.1587/e78-d_6_719/_p
@ARTICLE{e78-d_6_719,
author={Kouichi YAMAGUCHI and Harald SINGER and Shoichi MATSUNAGA and Shigeki SAGAYAMA},
journal={IEICE TRANSACTIONS on Information},
title={Speaker-Consistent Parsing for Speaker-Independent Continuous Speech Recognition},
year={1995},
volume={E78-D},
number={6},
pages={719-724},
abstract={This paper describes a novel speaker-independent speech recognition method, called "speaker-consistent parsing", which is based on an intra-speaker correlation called the speaker-consistency principle. We focus on the fact that a sentence or a string of words is uttered by an individual speaker even in a speaker-independent task. Thus, the proposed method searches through speaker variations in addition to the contents of utterances. As a result of the recognition process, an appropriate standard speaker is selected for speaker adaptation. This new method is experimentally compared with a conventional speaker-independent speech recognition method. Since the speaker-consistency principle best demonstrates its effect with a large number of training and test speakers, a small-scale experiment may not fully exploit this principle. Nevertheless, even the results of our small-scale experiment show that the new method significantly outperforms the conventional method. In addition, this framework's speaker selection mechanism can drastically reduce the likelihood map computation.},
keywords={},
doi={},
ISSN={},
month={June},}
TY - JOUR
TI - Speaker-Consistent Parsing for Speaker-Independent Continuous Speech Recognition
T2 - IEICE TRANSACTIONS on Information
SP - 719
EP - 724
AU - Kouichi YAMAGUCHI
AU - Harald SINGER
AU - Shoichi MATSUNAGA
AU - Shigeki SAGAYAMA
PY - 1995
DO -
JO - IEICE TRANSACTIONS on Information
SN -
VL - E78-D
IS - 6
JA - IEICE TRANSACTIONS on Information
Y1 - June 1995
AB - This paper describes a novel speaker-independent speech recognition method, called "speaker-consistent parsing", which is based on an intra-speaker correlation called the speaker-consistency principle. We focus on the fact that a sentence or a string of words is uttered by an individual speaker even in a speaker-independent task. Thus, the proposed method searches through speaker variations in addition to the contents of utterances. As a result of the recognition process, an appropriate standard speaker is selected for speaker adaptation. This new method is experimentally compared with a conventional speaker-independent speech recognition method. Since the speaker-consistency principle best demonstrates its effect with a large number of training and test speakers, a small-scale experiment may not fully exploit this principle. Nevertheless, even the results of our small-scale experiment show that the new method significantly outperforms the conventional method. In addition, this framework's speaker selection mechanism can drastically reduce the likelihood map computation.
ER -