Linear Prediction (LP) analysis is widely used in speech processing. LP is based on an Auto-Regressive (AR) model and estimates the AR model parameters from a signal by l2-norm optimization. Recently, sparse estimation has attracted attention because it can extract significant features from big data. Sparse estimation is realized by l1- or l0-norm optimization or regularization, and sparse LP analysis methods based on l1-norm optimization have been proposed. Since the excitation of speech is not white Gaussian, sparse LP estimation can estimate the parameters more accurately than conventional l2-norm-based LP. These methods are time-invariant and real-valued. We have been studying Time-Varying Complex AR (TV-CAR) analysis for an analytic signal and have evaluated its performance in speech processing. The existing TV-CAR methods are l2-norm methods. In this paper, we propose sparse TV-CAR analysis based on the adaptive LASSO (Least Absolute Shrinkage and Selection Operator), an l1-norm regularization method, and evaluate its performance on F0 estimation of speech using IRAPT (Instantaneous RAPT). The experimental results show that the sparse TV-CAR methods perform better under a high level of additive pink noise.
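To make the regularization in the abstract above concrete, the adaptive LASSO can be written, for the simpler real-valued, time-invariant LP case, roughly as follows. This is a generic textbook form and not the paper's TV-CAR formulation, which uses complex, time-varying parameters. The symbols (prediction order p, pilot estimate \tilde{a}_k, regularization weight \lambda, exponent \gamma) are assumptions introduced for illustration.

```latex
\hat{\boldsymbol{a}}
  = \arg\min_{\boldsymbol{a}}
    \sum_{n} \Bigl( x(n) - \sum_{k=1}^{p} a_k \, x(n-k) \Bigr)^{2}
    + \lambda \sum_{k=1}^{p} w_k \, |a_k| ,
\qquad
w_k = \frac{1}{|\tilde{a}_k|^{\gamma}}, \quad \lambda \ge 0, \ \gamma > 0
```

Here \tilde{a}_k would typically be a pilot estimate such as the conventional l2-norm LP solution. Setting all w_k = 1 gives the ordinary LASSO, and \lambda = 0 gives the conventional l2-norm LP.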
This paper proposes a novel robust speech F0 estimation method using Summation of Residual Harmonics (SRH) based on TV-CAR (Time-Varying Complex AR) analysis. SRH-based F0 estimation was proposed by A. Alwan; its criterion is calculated from LP residual signals as the sum of the residual spectrum values at the harmonics. In this paper, we propose SRH-based F0 estimation based on TV-CAR analysis, in which the criterion is calculated from the complex AR residual. Since the complex AR residual provides higher spectral resolution, the criterion is expected to be effective for F0 estimation. The experimental results demonstrate that the proposed method performs better than the conventional methods: weighted autocorrelation and YIN.
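The SRH criterion described above can be sketched as follows. This is a minimal illustration under stated assumptions: it uses a commonly cited form of SRH that sums the residual amplitude spectrum at the harmonics of a candidate F0 and subtracts the values at the inter-harmonic midpoints. The subtraction term, the function names, the rectangular window, and the idealized impulse-train residual are all assumptions and are not taken from the abstract. The LP or TV-CAR analysis that would produce the residual is not included.

```python
import cmath
import math


def amplitude_at(e, fs, f):
    """Amplitude of the DFT of residual e at frequency f (Hz), no window."""
    w = -2j * math.pi * f / fs
    return abs(sum(v * cmath.exp(w * n) for n, v in enumerate(e)))


def srh(e, fs, f, n_harm=5):
    """SRH value for candidate F0 f: harmonics minus inter-harmonics."""
    total = amplitude_at(e, fs, f)
    for k in range(2, n_harm + 1):
        total += amplitude_at(e, fs, k * f)
        total -= amplitude_at(e, fs, (k - 0.5) * f)
    return total


def srh_f0(e, fs, f0_min=100, f0_max=300):
    """Return the integer candidate F0 (Hz) that maximizes the SRH value."""
    return max(range(f0_min, f0_max + 1), key=lambda f: srh(e, fs, f))


# Idealized residual (assumption): impulse train with a 40-sample period,
# i.e. F0 = 8000 / 40 = 200 Hz.
fs = 8000
residual = [1.0 if n % 40 == 0 else 0.0 for n in range(400)]
print(srh_f0(residual, fs))
```

Subtracting the inter-harmonic values penalizes sub-harmonic candidates such as F0/2, whose even harmonics would otherwise collect the same spectral peaks.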
Keiichi FUNAKI Tatsuhiko KINJO
Complex speech analysis of an analytic speech signal can accurately estimate the spectrum at low frequencies, since the analytic signal has a spectrum only over positive frequencies. This remarkable feature makes it possible to realize more accurate F0 estimation using the complex residual signal extracted by complex-valued speech analysis. We have already proposed F0 estimation using the complex LPC residual, in which the autocorrelation function weighted by the AMDF was adopted as the criterion. That method adopted MMSE-based complex LPC analysis, and it has been reported to estimate F0 more accurately for IRS-filtered speech corrupted by white Gaussian noise, although it does not perform better for IRS-filtered speech corrupted by pink noise. In this paper, robust complex speech analysis based on the ELS (Extended Least Squares) method is introduced to overcome this drawback. The experimental results for additive white Gaussian or pink noise demonstrate that the proposed algorithm based on robust ELS-based complex AR analysis performs better than the other methods.
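The property that an analytic signal has a spectrum only over positive frequencies can be illustrated with a small sketch. It builds the analytic signal by zeroing the negative-frequency DFT bins and doubling the positive ones, which is the usual discrete construction. The function names and the plain O(N^2) DFT are assumptions chosen to keep the example dependency-free, and the complex AR analysis itself is not shown.

```python
import cmath
import math


def dft(x, inverse=False):
    """Plain O(N^2) DFT (or inverse DFT) of a sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(v * cmath.exp(sign * 2j * math.pi * k * m / n)
            for m, v in enumerate(x))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def analytic_signal(x):
    """Analytic signal of real x: keep DC/Nyquist, double positive bins,
    zero negative bins."""
    n = len(x)
    spec = dft(x)
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0          # DC and (for even n) Nyquist bin
        elif k < (n + 1) // 2:
            gain = 2.0          # positive frequencies
        else:
            gain = 0.0          # negative frequencies
        spec[k] *= gain
    return dft(spec, inverse=True)


# A cosine with 4 cycles in 64 samples becomes a complex exponential.
n = 64
x = [math.cos(2 * math.pi * 4 * m / n) for m in range(n)]
z = analytic_signal(x)
print(abs(dft(z)[4]), abs(dft(z)[n - 4]))  # positive bin vs. negative bin
```

The real part of the result reproduces the input, and the imaginary part is its Hilbert transform.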
Keiichi FUNAKI Tatsuhiko KINJO
This paper proposes a novel robust fundamental frequency (F0) estimation algorithm based on complex-valued speech analysis for an analytic speech signal. Since the analytic signal provides spectra only over positive frequencies, spectra can be accurately estimated at low frequencies. Consequently, F0 estimation using the residual signal extracted by complex-valued speech analysis is expected to perform better than F0 estimation using the residual signal extracted by conventional real-valued LPC analysis. In this paper, the autocorrelation function weighted by the AMDF is adopted as the F0 estimation criterion, and four signals (the speech signal, the analytic speech signal, the LPC residual, and the complex LPC residual) are evaluated for F0 estimation. The speech signals used in the experiments were IRS-filtered speech corrupted by additive white Gaussian noise or pink noise at noise levels of 10, 5, 0, and -5 [dB]. The experimental results demonstrate that the proposed algorithm based on the complex LPC residual performs better than the other methods in noisy environments.
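The AMDF-weighted autocorrelation criterion mentioned above can be sketched for a real-valued frame as follows. The sketch assumes one common form of the weighting, in which the autocorrelation at each lag is divided by the AMDF (Average Magnitude Difference Function) at that lag plus a small constant k. The exact weighting, the constant k, the search range, and the function name are assumptions rather than details given in the abstract, and the input here is a synthetic harmonic signal rather than an LPC or complex LPC residual.

```python
import math


def wautoc_f0(x, fs, f0_min=50.0, f0_max=400.0, k=1.0):
    """Estimate F0 (Hz) as the lag maximizing ACF(lag) / (AMDF(lag) + k)."""
    n = len(x)
    lag_min = int(fs / f0_max)
    lag_max = min(int(fs / f0_min), n - 1)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        m = n - lag  # number of overlapping sample pairs at this lag
        # Biased autocorrelation (divided by n), which favors shorter lags.
        acf = sum(x[i] * x[i + lag] for i in range(m)) / n
        # AMDF: small at lags where the signal repeats itself.
        amdf = sum(abs(x[i] - x[i + lag]) for i in range(m)) / m
        val = acf / (amdf + k)
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag


# Synthetic voiced-like frame (assumption): 200 Hz fundamental plus its
# second harmonic, 100 ms at 8 kHz.
fs = 8000
frame = [
    math.sin(2 * math.pi * 200 * i / fs)
    + 0.5 * math.sin(2 * math.pi * 400 * i / fs)
    for i in range(800)
]
print(wautoc_f0(frame, fs))
```

Because the autocorrelation peaks and the AMDF dips at the pitch period, dividing one by the other sharpens the peak at the true lag relative to either function used alone.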