Yuuji MUKAI Hideki NODA Takashi OSANAI
This paper discusses speaker verification (SV) using Gaussian mixture models (GMMs), where only utterances of enrolled speakers are required. Such an SV system can be realized by using artificially generated cohorts instead of real cohorts drawn from speaker databases. This paper presents a rational approach to setting GMM parameters for artificial cohorts, based on statistics of the GMM parameters of real cohorts. Equal error rates for the proposed method are about 10% lower than those for the previous method, in which GMM parameters for artificial cohorts were set in an ad hoc manner.
Toshiaki KAMADA Nobuaki MINEMATSU Takashi OSANAI Hisanori MAKINAE Masumi TANIMOTO
In forensic speaker verification for voice telephony, we may be asked to identify a speaker recorded in a very noisy environment, unlike the conditions assumed in general research. In such cases, the speech is first clarified before it is processed further. However, a previous study of speaker verification from clarified speech did not yield satisfactory results. In this study, we conducted speaker verification experiments with clarification of speech recorded in a noisy environment, and we examined the relationship between improvement in acoustic quality and speaker verification results. Moreover, experiments with realistic noise, such as a crime prevention alarm and power supply noise, were conducted, and speaker verification accuracy in a realistic environment was examined. We confirmed the validity of speaker verification with clarification of speech in a realistic noisy environment.
Yuuji MUKAI Hideki NODA Michiharu NIIMI Takashi OSANAI
This paper presents a text-independent speaker verification method using Gaussian mixture models (GMMs), where only utterances of enrolled speakers are required. Artificial cohorts are used instead of cohorts from speaker databases, and GMMs for the artificial cohorts are generated by changing the model parameters of the GMM for a claimed speaker. Equal error rates for the proposed method are about 60% lower than those for a conventional method that also uses only utterances of enrolled speakers.
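The equal error rate (EER) reported in the abstracts above is the operating point at which the false acceptance rate and the false rejection rate coincide. The following is a minimal sketch of how an EER might be estimated from verification scores; it is not taken from any of the listed papers. The function name `equal_error_rate`, the score lists, and the convention that a claim is accepted when its score is at least the threshold are all assumptions made for illustration.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Estimate the EER by scanning every observed score as a threshold.

    Assumed convention: a claim is accepted when score >= threshold.
    Returns (eer, threshold) at the point where the false acceptance
    rate (FAR) and false rejection rate (FRR) are closest.
    """
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best = None
    for t in thresholds:
        # FRR: fraction of genuine trials rejected at this threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # FAR: fraction of impostor trials accepted at this threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of FAR and FRR as the EER estimate.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical scores, for illustration only (not experimental data).
genuine = [2.1, 1.8, 0.9, 1.5, 0.2]
impostor = [-1.0, 0.1, 0.4, -0.3, 1.0]
eer, threshold = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2f} at threshold {threshold}")
```

With so few trials the estimate is coarse; with realistic trial counts the FAR and FRR curves cross more smoothly, and interpolating between neighbouring thresholds is a common refinement.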