Assessment of On-Line Model Quality and Threshold Estimation in Speaker Verification

Javier R. SAETA, Javier HERNANDO

Summary:

Selecting the most representative utterances from a speaker is essential for automatic enrollment to perform well in speaker verification. Model quality measures and threshold estimation methods mainly deal with the scarcity of data and the difficulty of obtaining impostor data in real applications. Conventional methods estimate the quality of the training utterances once the model has been created. In such a case, it is not possible to ask the user for more utterances during the training session if necessary; a new training session must be started. This is especially impractical in applications where only one or two enrollment sessions are allowed. In this paper, a new on-line quality method based on a male and a female Universal Background Model (UBM) is introduced. The two models act as a reference for new utterances: they show whether the utterances belong to the same speaker and, at the same time, provide a measure of their quality. The estimation of the verification threshold is also strongly influenced by the previous selection of the speaker's utterances. In this context, potential outliers, i.e., client scores that lie far from the mean, could lead to wrong estimates of the client mean and variance. To alleviate this problem, some efficient threshold estimation methods based on removing or weighting scores are proposed here. Before estimating the threshold, the client scores catalogued as outliers are removed, pruned or weighted, improving subsequent estimations. Text-dependent experiments have been carried out using a multi-session telephone database in Spanish. The database was recorded by the authors and contains 184 speakers.
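As a rough illustration of the outlier-removal idea described in the summary, the sketch below discards client scores that lie far from the mean before estimating a speaker-dependent threshold. It is not the authors' method: the function names, the `k`-standard-deviation pruning rule, and the threshold form `mean - alpha * std` are all assumptions chosen only to make the idea concrete.

```python
import statistics


def prune_outliers(scores, k=2.0):
    """Return the scores within k standard deviations of the mean.

    Hypothetical pruning rule; the summary does not specify the criterion.
    """
    mu = statistics.mean(scores)
    sigma = statistics.pstdev(scores)
    if sigma == 0:
        # All scores are identical, so nothing can be an outlier.
        return list(scores)
    return [s for s in scores if abs(s - mu) <= k * sigma]


def estimate_threshold(scores, alpha=1.0, k=2.0):
    """Estimate a threshold from client scores after pruning outliers.

    Assumed form: threshold = mean - alpha * std of the retained scores.
    """
    kept = prune_outliers(scores, k)
    mu = statistics.mean(kept)
    sigma = statistics.pstdev(kept)
    return mu - alpha * sigma


# Illustrative client scores (made up); the last one is a clear outlier.
client_scores = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, -3.0]
kept_scores = prune_outliers(client_scores)
threshold = estimate_threshold(client_scores)
```

With the outlier removed, the mean and variance are computed from the six consistent scores only, so the resulting threshold is not dragged down by the single distant score. A weighting variant would keep every score but give distant ones a smaller weight instead of discarding them.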

Publication
IEICE TRANSACTIONS on Information Vol.E90-D No.4 pp.759-765
Publication Date
2007/04/01
Online ISSN
1745-1361
DOI
10.1093/ietisy/e90-d.4.759
Type of Manuscript
PAPER
Category
Speech and Hearing
