The search functionality is under construction.
The search functionality is under construction.

PS-ZCPA Based Feature Extraction with Auditory Masking, Modulation Enhancement and Noise Reduction for Robust ASR

Muhammad GHULAM, Takashi FUKUDA, Kouichi KATSURADA, Junsei HORIKAWA, Tsuneo NITTA

  • Full Text Views

    0

  • Cite this

Summary :

A pitch-synchronous (PS) auditory feature extraction method based on ZCPA (Zero-Crossings Peak-Amplitudes) was proposed previously and showed more robustness over a conventional ZCPA and MFCC based features. In this paper, firstly, a non-linear adaptive threshold adjustment procedure is introduced into the PS-ZCPA method to get optimal results in noisy conditions with different signal-to-noise ratio (SNR). Next, auditory masking, a well-known auditory perception, and modulation enhancement that simulates a strong relationship between modulation spectrums and intelligibility of speech are embedded into the PS-ZCPA method. Finally, a Wiener filter based noise reduction procedure is integrated into the method to make it more noise-robust, and the performance is evaluated against ETSI ES202 (WI008), which is a standard front-end for distributed speech recognition. All the experiments were carried out on Aurora-2J database. The experimental results demonstrated improved performance of the PS-ZCPA method by embedding auditory masking into it, and a slightly improved performance by using modulation enhancement. The PS-ZCPA method with Wiener filter based noise reduction also showed better performance than ETSI ES202 (WI008).

Publication
IEICE TRANSACTIONS on Information Vol.E89-D No.3 pp.1015-1023
Publication Date
2006/03/01
Publicized
Online ISSN
1745-1361
DOI
10.1093/ietisy/e89-d.3.1015
Type of Manuscript
Special Section PAPER (Special Section on Statistical Modeling for Speech Processing)
Category
Speech Recognition

Authors

Keyword