IEICE TRANSACTIONS on Fundamentals

Speech Analysis Method Based on Source-Filter Model Using Multivariate Empirical Mode Decomposition

Surasak BOONKLA, Masashi UNOKI, Stanislav S. MAKHANOV, Chai WUTIWIWATCHAI

Summary:

We propose a speech analysis method based on the source-filter model using multivariate empirical mode decomposition (MEMD). The proposed method takes multiple adjacent frames of a speech signal into account by combining their log spectra into multivariate signals. The multivariate signals are then decomposed into intrinsic mode functions (IMFs), which are divided into two groups according to the peak of the autocorrelation function (ACF) of each IMF. The first group, characterized by spectral fine structure, is used to estimate the fundamental frequency F0 by means of the ACF, whereas the second group, characterized by the frequency response of the vocal-tract filter, is used to estimate formant frequencies by peak picking. Using MEMD has two advantages: (i) the variation in the number of IMFs that occurs with single-frame empirical mode decomposition is eliminated, and (ii) information common to the adjacent frames is aligned in IMFs of the same order because of the common mode alignment property of MEMD. These advantages make the analysis more accurate than with other methods. In contrast to the conventional linear prediction (LP) and cepstrum methods, which depend on the choice of LP order and cut-off frequency, respectively, the proposed method automatically separates the glottal source from the vocal-tract filter. The results showed that the proposed method achieved the highest accuracy of F0 estimation and correctly estimated the formant frequencies of the vocal-tract filter.
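The F0 step described in the summary can be illustrated with a minimal single-frame sketch in Python. It is not the authors' implementation: MEMD is not applied here, and a plain moving average over the log spectrum stands in for the envelope/fine-structure separation that the paper obtains from grouped IMFs. The function names (`moving_average`, `estimate_f0`), the smoothing width, the F0 search range, and the synthetic test frame are all assumptions chosen for illustration. The sketch only shows why the ACF of the spectral fine structure peaks at a lag equal to the harmonic spacing, which is F0.

```python
import numpy as np


def moving_average(x, width):
    """Boxcar smoothing with an odd width; edges are padded to limit bias."""
    half = width // 2
    padded = np.pad(x, half, mode="edge")
    kernel = np.ones(width) / width
    return np.convolve(padded, kernel, mode="valid")


def estimate_f0(frame, fs, nfft=2048, smooth_bins=101, f0_min=60.0, f0_max=400.0):
    """Estimate F0 from the ACF of the log-spectral fine structure.

    Assumption: the moving average is a crude stand-in for the vocal-tract
    envelope; the paper separates the two components with MEMD instead.
    """
    windowed = frame * np.hanning(len(frame))
    log_spec = np.log(np.abs(np.fft.rfft(windowed, nfft)) + 1e-12)

    # Fine structure = log spectrum minus its smooth trend, with zero mean.
    fine = log_spec - moving_average(log_spec, smooth_bins)
    fine = fine - fine.mean()

    # ACF along the frequency axis, non-negative lags only.
    acf = np.correlate(fine, fine, mode="full")[len(fine) - 1:]

    # Harmonics are spaced F0 apart, so the ACF peak lag (in bins) gives F0.
    bin_hz = fs / nfft
    lo = int(np.ceil(f0_min / bin_hz))
    hi = int(np.floor(f0_max / bin_hz))
    lag = lo + int(np.argmax(acf[lo:hi + 1]))
    return lag * bin_hz


# Synthetic harmonic frame (hypothetical test signal, not speech data).
fs = 16000
f0_true = 250.0
t = np.arange(1024) / fs
rng = np.random.default_rng(0)
frame = sum(np.sin(2 * np.pi * k * f0_true * t) / k for k in range(1, 32))
frame = frame + 0.01 * rng.standard_normal(len(t))

f0_est = estimate_f0(frame, fs)
print(f"estimated F0: {f0_est:.1f} Hz")
```

The estimate is quantized to the bin spacing `fs / nfft`, so a finer result would need a larger `nfft` or interpolation around the ACF peak. The fixed smoothing width plays a role similar to the cut-off frequency of the cepstrum method, which is exactly the manual parameter the proposed MEMD-based separation is meant to avoid.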

Publication
IEICE TRANSACTIONS on Fundamentals Vol.E99-A No.10 pp.1762-1773
Publication Date
2016/10/01
Online ISSN
1745-1337
DOI
10.1587/transfun.E99.A.1762
Type of Manuscript
PAPER
Category
Speech and Hearing

Authors

Surasak BOONKLA
  Japan Advanced Institute of Science and Technology, Thammasat University
Masashi UNOKI
  Japan Advanced Institute of Science and Technology
Stanislav S. MAKHANOV
  Thammasat University
Chai WUTIWIWATCHAI
  National Electronics and Computer Technology Center

Keyword