Text-Independent Speaker Identification Using Gaussian Mixture Models Based on Multi-Space Probability Distribution

Chiyomi MIYAJIMA, Yosuke HATTORI, Keiichi TOKUDA, Takashi MASUKO, Takao KOBAYASHI, Tadashi KITAMURA

Summary:

This paper presents a new approach to modeling speech spectra and pitch for text-independent speaker identification using Gaussian mixture models based on multi-space probability distribution (MSD-GMM). The MSD-GMM allows us to model continuous pitch values for voiced frames and discrete symbols for unvoiced frames in a unified framework. Spectral and pitch features are jointly modeled by a two-stream MSD-GMM. We derive maximum likelihood (ML) estimation formulae and a minimum classification error (MCE) training procedure for the MSD-GMM parameters. The MSD-GMM speaker models are evaluated on text-independent speaker identification tasks. The experimental results show that the MSD-GMM can efficiently model the spectral and pitch features of each speaker and that it outperforms conventional speaker models. The results also demonstrate the utility of MCE training of the MSD-GMM parameters and the robustness of the models to inter-session variability.
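
To illustrate the idea of handling voiced and unvoiced frames in one model, the following is a minimal sketch of a pitch-only multi-space likelihood and a maximum-likelihood speaker decision. It is not the paper's implementation: it omits the spectral stream, ML estimation, and MCE training, and it assumes a simplified form in which each voiced component is a one-dimensional Gaussian over log F0 and the unvoiced space is zero-dimensional, so an unvoiced frame contributes only its space weight. All function names, model parameters, and frame values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def msd_frame_likelihood(pitch, model):
    """Likelihood of one pitch observation under a simplified multi-space model.

    pitch is a log-F0 value for a voiced frame, or None for an unvoiced frame.
    """
    if pitch is None:
        # Unvoiced: the space is zero-dimensional, so only its weight remains.
        return model["unvoiced_weight"]
    # Voiced: weighted sum of the 1-D Gaussian components.
    return sum(w * gaussian_pdf(pitch, m, v) for w, m, v in model["voiced"])


def log_likelihood(frames, model):
    """Sum of per-frame log-likelihoods, treating frames as independent."""
    return sum(math.log(msd_frame_likelihood(p, model)) for p in frames)


def identify(frames, models):
    """Return the name of the speaker model with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(frames, models[name]))


# Hypothetical models: each voiced entry is (weight, mean log F0, variance).
# The unvoiced weight and the voiced weights sum to 1 within each model.
models = {
    "speaker_a": {"unvoiced_weight": 0.3, "voiced": [(0.4, 4.8, 0.02), (0.3, 5.0, 0.03)]},
    "speaker_b": {"unvoiced_weight": 0.4, "voiced": [(0.6, 5.4, 0.02)]},
}

# Hypothetical test utterance: log-F0 values, with None marking unvoiced frames.
frames = [4.8, None, 4.9, 5.0, None]
print(identify(frames, models))
```

Because the unvoiced weight and the voiced component weights share one normalization, the model scores both frame types without interpolating pitch through unvoiced regions or discarding those frames.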

Publication
IEICE TRANSACTIONS on Information and Systems Vol.E84-D No.7 pp.847-855
Publication Date
2001/07/01
Type of Manuscript
Special Section PAPER (Special Issue on Biometric Person Authentication)