IEICE global.ieice.org Site

Author Search Result

[Author] Takehiro IHARA(2hit)

1-2hit

Noise Reduction in Time Domain Using Referential Reconstruction
Takehiro IHARA Takayuki NAGAI Kazuhiko OZEKI Akira KUREMATSU

PAPER-Speech and Hearing

Vol:
E89-D No:3
Page(s):
1203-1213
We present a novel approach for single-channel noise reduction of speech signals contaminated by additive noise. In this approach, the system requires speech samples to be uttered in advance by the same speaker as that of the input signal. Speech samples used in this method must have enough phonetic variety to reconstruct the input signal. In the proposed method, which we refer to as referential reconstruction, we have used a small database created from examples of speech, which will be called reference signals. Referential reconstruction uses an example-based approach, in which the objective is to find the candidate speech frame which is the most similar to the clean input frame without noise, although the input frame is contaminated with noise. When candidate frames are found, they become final outputs without any special processing. In order to find the candidate frames, a correlation coefficient is used as a similarity measure. Through automatic speech recognition experiments, the proposed method was shown to be effective, particularly for low-SNR speech signals corrupted with white noise or noise in high-frequency bands. Since the direct implementation of this method requires infeasible computational cost for searching through reference signals, a coarse-to-fine strategy is introduced in this paper.
The Use of Overlapped Sub-Bands in Multi-Band, Multi-SNR, Multi-Path Recognition of Noisy Word Utterances
Yutaka TSUBOI Takehiro IHARA Kazuyuki TAKAGI Kazuhiko OZEKI

PAPER-Speech and Hearing

Vol:
E91-D No:6
Page(s):
1774-1782
A solution to the problem of improving robustness to noise in automatic speech recognition is presented in the framework of multi-band, multi-SNR, and multi-path approaches. In our word recognizer, the whole frequency band is divided into seven-overlapped sub-bands, and then sub-band noisy phoneme HMMs are trained on speech data mixed with the filtered white Gaussian noise at multiple SNRs. The acoustic model of a word is built as a set of concatenations of clean and noisy sub-band phoneme HMMs arranged in parallel. A Viterbi decoder allows a search path to transit to another SNR condition at a phoneme boundary. The recognition scores of the sub-bands are then recombined to give the score for a word. Experiments show that the overlapped seven-band system yields the best performance under nonstationary ambient noises. It is also shown that the use of filtered white Gaussian noise is advantageous for training noisy phoneme HMMs.

Author Search Result

[Author] Takehiro IHARA(2hit)

Noise Reduction in Time Domain Using Referential Reconstruction

The Use of Overlapped Sub-Bands in Multi-Band, Multi-SNR, Multi-Path Recognition of Noisy Word Utterances

Latest Issue

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles