The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] Wiener filtering(7hit)

1-7hit
  • Integration of Spatial Cue-Based Noise Reduction and Speech Model-Based Source Restoration for Real Time Speech Enhancement

    Tomoko KAWASE  Kenta NIWA  Masakiyo FUJIMOTO  Kazunori KOBAYASHI  Shoko ARAKI  Tomohiro NAKATANI  

     
    PAPER-Digital Signal Processing

      Vol:
    E100-A No:5
      Page(s):
    1127-1136

    We propose a microphone array speech enhancement method that integrates spatial-cue-based source power spectral density (PSD) estimation and statistical speech model-based PSD estimation. The goal of this research was to clearly pick up target speech even in noisy environments such as crowded places, factories, and cars running at high speed. Beamforming with post-Wiener filtering is commonly used in many conventional studies on microphone-array noise reduction. For calculating a Wiener filter, speech/noise PSDs are essential, and they are estimated using spatial cues obtained from microphone observations. Assuming that the sound sources are sparse in the temporal-spatial domain, speech/noise PSDs may be estimated accurately. However, PSD estimation errors increase under circumstances beyond this assumption. In this study, we integrated speech models and PSD-estimation-in-beamspace method to correct speech/noise PSD estimation errors. The roughly estimated noise PSD was obtained frame-by-frame by analyzing spatial cues from array observations. By combining noise PSD with the statistical model of clean-speech, the relationships between the PSD of the observed signal and that of the target speech, hereafter called the observation model, could be described without pre-training. By exploiting Bayes' theorem, a Wiener filter is statistically generated from observation models. Experiments conducted to evaluate the proposed method showed that the signal-to-noise ratio and naturalness of the output speech signal were significantly better than that with conventional methods.

  • Sub-Band Noise Reduction in Multi-Channel Digital Hearing Aid

    Qingyun WANG  Ruiyu LIANG  Li JING  Cairong ZOU  Li ZHAO  

     
    LETTER-Speech and Hearing

      Pubricized:
    2015/10/14
      Vol:
    E99-D No:1
      Page(s):
    292-295

    Since digital hearing aids are sensitive to time delay and power consumption, the computational complexity of noise reduction must be reduced as much as possible. Therefore, some complicated algorithms based on the analysis of the time-frequency domain are very difficult to implement in digital hearing aids. This paper presents a new approach that yields an improved noise reduction algorithm with greatly reduce computational complexity for multi-channel digital hearing aids. First, the sub-band sound pressure level (SPL) is calculated in real time. Then, based on the calculated sub-band SPL, the noise in the sub-band is estimated and the possibility of speech is computed. Finally, a posteriori and a priori signal-to-noise ratios are estimated and the gain function is acquired to reduce the noise adaptively. By replacing the FFT and IFFT transforms by the known SPL, the proposed algorithm greatly reduces the computation loads. Experiments on a prototype digital hearing aid show that the time delay is decreased to nearly half that of the traditional adaptive Wiener filtering and spectral subtraction algorithms, but the SNR improvement and PESQ score are rather satisfied. Compared with modulation frequency-based noise reduction algorithm, which is used in many commercial digital hearing aids, the proposed algorithm achieves not only more than 5dB SNR improvement but also less time delay and power consumption.

  • A Speech Enhancement Algorithm Based on Blind Signal Cancelation in Diffuse Noise Environments

    Jaesik HWANG  Jaepil SEO  Ji-Won CHO  Hyung-Min PARK  

     
    LETTER-Speech and Hearing

      Vol:
    E99-A No:1
      Page(s):
    407-411

    This letter describes a speech enhancement algorithm for stereo signals corrupted by diffuse noise. It estimates the noise signal and also a beamformed target signal based on blind target signal cancelation derived from sparsity minimization. Enhanced target speech is obtained by Wiener filtering using both the signals. Experimental results demonstrate the effectiveness of the proposed method.

  • Pitch-Synchronous Peak-Amplitude (PS-PA)-Based Feature Extraction Method for Noise-Robust ASR

    Muhammad GHULAM  Kouichi KATSURADA  Junsei HORIKAWA  Tsuneo NITTA  

     
    PAPER-Speech and Hearing

      Vol:
    E89-D No:11
      Page(s):
    2766-2774

    A novel pitch-synchronous auditory-based feature extraction method for robust automatic speech recognition (ASR) is proposed. A pitch-synchronous zero-crossing peak-amplitude (PS-ZCPA)-based feature extraction method was proposed previously and it showed improved performances except when modulation enhancement was integrated with Wiener filter (WF)-based noise reduction and auditory masking. However, since zero-crossing is not an auditory event, we propose a new pitch-synchronous peak-amplitude (PS-PA)-based method to render the feature extractor of ASR more auditory-like. We also examine the effects of WF-based noise reduction, modulation enhancement, and auditory masking in the proposed PS-PA method using the Aurora-2J database. The experimental results show superiority of the proposed method over the PS-ZCPA and other conventional methods. Furthermore, the problem due to the reconstruction of zero-crossings from a modulated envelope is eliminated. The experimental results also show the superiority of PS over PA in terms of the robustness of ASR, though PS and PA lead to significant improvement when applied together.

  • Image Authentication Based on Modular Embedding

    Moon Ho LEE  Valery KORZHIK  Guillermo MORALES-LUNA  Sergei LUSSE  Evgeny KURBATOV  

     
    PAPER-Application Information Security

      Vol:
    E89-D No:4
      Page(s):
    1498-1506

    We consider a watermark application to assist in the integrity maintenance and verification of the associated images. There is a great benefit in using WM in the context of authentication since it does not require any additional storage space for supplementary metadata, in contrast with cryptographic signatures, for instance. However there is a fundamental problem in the case of exact authentication: How to embed a signature into a cover message in such a way that it would be possible to restore the watermarked cover image into its original state without any error? There are different approaches to solve this problem. We use the watermarking method consisting of modulo addition of a mark and investigate it in detail. Our contribution lies in investigating different modified techniques of both watermark embedding and detection in order to provide the best reliability of watermark authentication. The simulation results for different types of embedders and detectors in combination with the pictures of watermarked images are given.

  • PS-ZCPA Based Feature Extraction with Auditory Masking, Modulation Enhancement and Noise Reduction for Robust ASR

    Muhammad GHULAM  Takashi FUKUDA  Kouichi KATSURADA  Junsei HORIKAWA  Tsuneo NITTA  

     
    PAPER-Speech Recognition

      Vol:
    E89-D No:3
      Page(s):
    1015-1023

    A pitch-synchronous (PS) auditory feature extraction method based on ZCPA (Zero-Crossings Peak-Amplitudes) was proposed previously and showed more robustness over a conventional ZCPA and MFCC based features. In this paper, firstly, a non-linear adaptive threshold adjustment procedure is introduced into the PS-ZCPA method to get optimal results in noisy conditions with different signal-to-noise ratio (SNR). Next, auditory masking, a well-known auditory perception, and modulation enhancement that simulates a strong relationship between modulation spectrums and intelligibility of speech are embedded into the PS-ZCPA method. Finally, a Wiener filter based noise reduction procedure is integrated into the method to make it more noise-robust, and the performance is evaluated against ETSI ES202 (WI008), which is a standard front-end for distributed speech recognition. All the experiments were carried out on Aurora-2J database. The experimental results demonstrated improved performance of the PS-ZCPA method by embedding auditory masking into it, and a slightly improved performance by using modulation enhancement. The PS-ZCPA method with Wiener filter based noise reduction also showed better performance than ETSI ES202 (WI008).

  • Comparative Study of Discrete Orthogonal Transforms in Adaptive Signal Processing

    Susanto RAHARDJA  Bogdan J. FALKOWSKI  

     
    PAPER

      Vol:
    E82-A No:8
      Page(s):
    1386-1390

    In this paper, comparison of various orthogonal transforms in Wiener filtering is discussed. The study involves the family of discrete orthogonal transforms called Complex Hadamard Transform, which has been recently introduced by the same authors. Basic definitions, properties and transformation kernel of Complex Hadamard Transform are also shown.