
Keyword Search Result

[Keyword] distributed speaker recognition (2 hits)

  • Effects of Phoneme Type and Frequency on Distributed Speaker Identification and Verification

    Mohamed Abdel FATTAH  Fuji REN  Shingo KUROIWA  

     
    PAPER-Speech and Hearing

    Vol: E89-D  No: 5  Page(s): 1712-1719

    In the European Telecommunications Standards Institute (ETSI) Distributed Speech Recognition (DSR) front-end, the distortion introduced by feature compression on the front-end side increases the variance flooring effect, which in turn increases the identification error rate. The penalty for reducing the bit rate is degraded speaker recognition performance. In this paper, we present a nontraditional solution to this problem. To reduce the bit rate, the speech signal is segmented at the client, and the phonemes most effective for speaker recognition (determined according to their type and frequency) are selected and sent to the server, where speaker recognition is performed. Applying this approach to the YOHO corpus, we achieved an identification error rate (ER) of 0.05% using an average segment of 20.4% for a testing utterance in a speaker identification task. We also achieved an equal error rate (EER) of 0.42% using an average segment of 15.1% for a testing utterance in a speaker verification task.

  • Nonparametric Speaker Recognition Method Using Earth Mover's Distance

    Shingo KUROIWA  Yoshiyuki UMEDA  Satoru TSUGE  Fuji REN  

     
    PAPER-Speaker Recognition

    Vol: E89-D  No: 3  Page(s): 1074-1081

    In this paper, we propose a distributed speaker recognition method using a nonparametric speaker model and Earth Mover's Distance (EMD). In distributed speaker recognition, quantized feature vectors are sent to a server. The Gaussian mixture model (GMM), the traditional method for speaker recognition, is trained using the maximum likelihood approach; however, it is difficult to fit continuous density functions to quantized data. To overcome this problem, the proposed method represents each speaker model as a speaker-dependent VQ code histogram built from registered feature vectors and directly calculates the distance between the histograms of the speaker models and of the quantized testing feature vectors. To measure the distance between each speaker model and the testing data, we use EMD, which can calculate the distance between histograms with different bins. We conducted text-independent speaker identification experiments using the proposed method. Compared to the traditional GMM, the proposed method yielded a relative error reduction of 32% for quantized data.
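
    The idea in the abstract above (VQ code histograms compared with EMD) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `vq_signature`, `emd`, and `identify`, the Euclidean ground distance, the toy codebooks, and the successive-shortest-path solver are all assumptions chosen for illustration, and the solver is only intended for small histograms.

```python
import math

def vq_signature(vectors, codebook):
    """Quantize vectors to their nearest codeword; return (codeword, relative frequency) pairs."""
    counts = [0] * len(codebook)
    for v in vectors:
        k = min(range(len(codebook)), key=lambda idx: math.dist(v, codebook[idx]))
        counts[k] += 1
    return [(codebook[k], c / len(vectors)) for k, c in enumerate(counts) if c > 0]

def emd(sig_a, sig_b, eps=1e-12):
    """Earth Mover's Distance between two signatures whose weights each sum to 1.

    Assumption: Euclidean ground distance between codewords. The transportation
    problem is solved by successive shortest augmenting paths (Bellman-Ford).
    """
    n, m = len(sig_a), len(sig_b)
    cost = [[math.dist(ca, cb) for cb, _ in sig_b] for ca, _ in sig_a]
    supply = [w for _, w in sig_a]
    demand = [w for _, w in sig_b]
    flow = [[0.0] * m for _ in range(n)]

    while True:
        # Nodes 0..n-1 are bins of sig_a, nodes n..n+m-1 are bins of sig_b.
        # Start from every bin of sig_a that still has mass to ship.
        dist = [0.0 if supply[i] > eps else math.inf for i in range(n)] + [math.inf] * m
        prev = [None] * (n + m)
        for _ in range(n + m):
            changed = False
            for i in range(n):
                for j in range(m):
                    # Forward residual edge: ship more mass from i to j.
                    if dist[i] + cost[i][j] < dist[n + j] - eps:
                        dist[n + j], prev[n + j], changed = dist[i] + cost[i][j], i, True
                    # Backward residual edge: undo mass already shipped from i to j.
                    if flow[i][j] > eps and dist[n + j] - cost[i][j] < dist[i] - eps:
                        dist[i], prev[i], changed = dist[n + j] - cost[i][j], n + j, True
            if not changed:
                break
        targets = [j for j in range(m) if demand[j] > eps and dist[n + j] < math.inf]
        if not targets:
            break
        j_end = min(targets, key=lambda j: dist[n + j])
        # Walk back from the chosen sink to the source bin, collecting path edges.
        path, node = [], n + j_end
        while prev[node] is not None:
            path.append((prev[node], node))
            node = prev[node]
        amount = min(supply[node], demand[j_end])
        for u, v in path:
            if u >= n:  # backward edge limits the augmentation to the flow it can undo
                amount = min(amount, flow[v][u - n])
        for u, v in path:
            if u < n:
                flow[u][v - n] += amount
            else:
                flow[v][u - n] -= amount
        supply[node] -= amount
        demand[j_end] -= amount

    return sum(flow[i][j] * cost[i][j] for i in range(n) for j in range(m))

def identify(test_sig, speaker_models):
    """Return the name of the speaker model with the smallest EMD to the test signature."""
    return min(speaker_models, key=lambda name: emd(speaker_models[name], test_sig))
```

    Because each signature carries its own codewords, the speaker model and the test data do not need to share a codebook, which is the property the abstract attributes to EMD. In this sketch, `identify(test_sig, {"alice": sig_alice, "bob": sig_bob})` (hypothetical speaker names) would return whichever model lies closest to the test histogram.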