Effects of Phoneme Type and Frequency on Distributed Speaker Identification and Verification

Mohamed Abdel FATTAH; Fuji REN; Shingo KUROIWA

doi:10.1093/ietisy/e89-d.5.1712

IEICE TRANSACTIONS on Information

Summary :

In the European Telecommunications Standards Institute (ETSI) Distributed Speech Recognition (DSR) front-end, the distortion introduced by feature compression on the front-end side increases the variance flooring effect, which in turn increases the identification error rate. The penalty for reducing the bit rate is thus degraded speaker recognition performance. In this paper, we present a nontraditional solution to this problem. To reduce the bit rate, the speech signal is segmented at the client, and the phonemes most effective for speaker recognition (determined according to their type and frequency) are selected and sent to the server, where speaker recognition takes place. Applying this approach to the YOHO corpus, we achieved an identification error rate (ER) of 0.05% using, on average, a segment of 20.4% of each testing utterance in a speaker identification task. We also achieved an equal error rate (EER) of 0.42% using, on average, a segment of 15.1% of each testing utterance in a speaker verification task.
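
The client-side selection step described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the `Segment` type, the `select_segments` function, the `phoneme_rank` table and the frame budget are all hypothetical names and values introduced here, and the summary does not state which phonemes are ranked highest or how the ranking is derived. The sketch only assumes that the client already has phoneme-labelled segments and a ranking of phonemes by usefulness, and that it keeps the best-ranked segments until a target fraction of the utterance is reached.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    phoneme: str   # label assigned by a (hypothetical) client-side segmenter
    n_frames: int  # number of feature frames in this segment


def select_segments(segments, phoneme_rank, budget):
    """Keep segments of the best-ranked phonemes within a frame budget.

    phoneme_rank maps a phoneme label to a rank (lower = more useful);
    this table is an assumption, not taken from the paper.
    budget is the maximum fraction of the utterance's frames to transmit.
    Returns the kept segments in temporal order and the fraction kept.
    """
    total = sum(s.n_frames for s in segments)
    if total == 0:
        return [], 0.0
    limit = budget * total
    worst = len(phoneme_rank)  # unranked phonemes are considered last
    order = sorted(
        range(len(segments)),
        key=lambda i: phoneme_rank.get(segments[i].phoneme, worst),
    )
    chosen, used = [], 0
    for i in order:
        n = segments[i].n_frames
        if used + n > limit:
            continue  # this segment would exceed the budget, try the next one
        chosen.append(i)
        used += n
    chosen.sort()  # restore the original temporal order before sending
    return [segments[i] for i in chosen], used / total
```

Only the returned segments would be compressed and sent to the server; the returned fraction corresponds to the "average segment" percentage quoted in the summary once it is averaged over test utterances.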

Publication: IEICE TRANSACTIONS on Information Vol.E89-D No.5 pp.1712-1719

Publication Date: 2006/05/01

Online ISSN: 1745-1361

DOI: 10.1093/ietisy/e89-d.5.1712

Type of Manuscript: PAPER

Category: Speech and Hearing

Mohamed Abdel FATTAH, Fuji REN, Shingo KUROIWA, "Effects of Phoneme Type and Frequency on Distributed Speaker Identification and Verification" in IEICE TRANSACTIONS on Information, vol. E89-D, no. 5, pp. 1712-1719, May 2006, doi: 10.1093/ietisy/e89-d.5.1712.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e89-d.5.1712/_p

@ARTICLE{e89-d_5_1712,
author={Mohamed Abdel FATTAH and Fuji REN and Shingo KUROIWA},
journal={IEICE TRANSACTIONS on Information},
title={Effects of Phoneme Type and Frequency on Distributed Speaker Identification and Verification},
year={2006},
volume={E89-D},
number={5},
pages={1712--1719},
abstract={In the European Telecommunications Standards Institute (ETSI) Distributed Speech Recognition (DSR) front-end, the distortion introduced by feature compression on the front-end side increases the variance flooring effect, which in turn increases the identification error rate. The penalty for reducing the bit rate is thus degraded speaker recognition performance. In this paper, we present a nontraditional solution to this problem. To reduce the bit rate, the speech signal is segmented at the client, and the phonemes most effective for speaker recognition (determined according to their type and frequency) are selected and sent to the server, where speaker recognition takes place. Applying this approach to the YOHO corpus, we achieved an identification error rate (ER) of 0.05\% using, on average, a segment of 20.4\% of each testing utterance in a speaker identification task. We also achieved an equal error rate (EER) of 0.42\% using, on average, a segment of 15.1\% of each testing utterance in a speaker verification task.},
keywords={},
doi={10.1093/ietisy/e89-d.5.1712},
ISSN={1745-1361},
month={May},
}

TY  - JOUR
TI  - Effects of Phoneme Type and Frequency on Distributed Speaker Identification and Verification
T2  - IEICE TRANSACTIONS on Information
SP  - 1712
EP  - 1719
AU  - Mohamed Abdel FATTAH
AU  - Fuji REN
AU  - Shingo KUROIWA
PY  - 2006
DO  - 10.1093/ietisy/e89-d.5.1712
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E89-D
IS  - 5
JA  - IEICE TRANSACTIONS on Information
Y1  - 2006/05/01
AB  - In the European Telecommunications Standards Institute (ETSI) Distributed Speech Recognition (DSR) front-end, the distortion introduced by feature compression on the front-end side increases the variance flooring effect, which in turn increases the identification error rate. The penalty for reducing the bit rate is thus degraded speaker recognition performance. In this paper, we present a nontraditional solution to this problem. To reduce the bit rate, the speech signal is segmented at the client, and the phonemes most effective for speaker recognition (determined according to their type and frequency) are selected and sent to the server, where speaker recognition takes place. Applying this approach to the YOHO corpus, we achieved an identification error rate (ER) of 0.05% using, on average, a segment of 20.4% of each testing utterance in a speaker identification task. We also achieved an equal error rate (EER) of 0.42% using, on average, a segment of 15.1% of each testing utterance in a speaker verification task.
ER  - 
