Author Search Result

[Author] Zhaoyang GUO (3 hits)

  • A Comprehensive Method to Improve Loudness Compensation and High-Frequency Speech Intelligibility for Digital Hearing Aids

    Zhaoyang GUO  Bo WANG  Xin'an WANG  

     
    LETTER-Speech and Hearing

    Vol: E100-A  No: 7  Page(s): 1552-1556

    A comprehensive method is proposed that applies nonlinear frequency compression (FC) as a complement to multi-band loudness compensation, improving loudness compensation while simultaneously increasing high-frequency speech intelligibility for digital hearing aids. The proposed nonlinear FC (NLFC) improves on conventional methods in that the compression ratio (CR) is adjusted based on the speech intelligibility percentage in different frequency ranges. Then, an adaptive wide dynamic range compression (AWDRC) with a time-varying CR is applied to achieve adaptive loudness compensation. The experimental results show that the mean speech identification is improved in comparison with state-of-the-art methods.

  • An Improved Perceptual MBSS Noise Reduction with an SNR-Based VAD for a Fully Operational Digital Hearing Aid

    Zhaoyang GUO  Xin'an WANG  Bo WANG  Shanshan YONG  

     
    PAPER-Speech and Hearing

    Publicized: 2017/02/17  Vol: E100-D  No: 5  Page(s): 1087-1096

    This paper first reviews state-of-the-art noise reduction methods and points out their weaknesses in noise reduction performance and speech quality, especially in low signal-to-noise ratio (SNR) environments. It then presents an improved perceptual multiband spectral subtraction (MBSS) noise reduction algorithm (NRA) and a novel robust voice activity detection (VAD) based on the amended sub-band SNR. The proposed SNR-based VAD can considerably increase the accuracy of discrimination between noise and speech frames. The simulation results show that the proposed NRA achieves better segmental SNR (segSNR) and perceptual evaluation of speech quality (PESQ) performance than other noise reduction algorithms, especially in low SNR environments. In addition, a fully operational digital hearing aid chip based on the proposed NRA is designed and fabricated in a 0.13 µm CMOS process. The final chip implementation shows that the whole chip consumes 1.3 mA at 1.2 V operation. The acoustic test results show that the maximum output sound pressure level (OSPL) is 114.6 dB SPL, the equivalent input noise is 5.9 dB SPL, and the total harmonic distortion is 2.5%. The proposed digital hearing aid chip is therefore a promising candidate for high-performance hearing-aid systems.

  • A Novel 3D Gradient LBP Descriptor for Action Recognition

    Zhaoyang GUO  Xin'an WANG  Bo WANG  Zheng XIE  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2017/03/02  Vol: E100-D  No: 6  Page(s): 1388-1392

    In the field of action recognition, Spatio-Temporal Interest Points (STIPs)-based features have shown high efficiency and robustness. However, most state-of-the-art work on describing STIPs focuses on 2-dimensional (2D) images and thus ignores information in 3D spatio-temporal space. In addition, a compact representation of descriptors should be considered because of the costs of storage and computational time. In this paper, a novel local descriptor named 3D Gradient LBP is proposed, which extends the traditional Local Binary Patterns (LBP) descriptor into 3D spatio-temporal space. The proposed descriptor takes advantage of the neighbourhood information of cuboids in three dimensions, which accounts for its excellent descriptive power for the distribution of grey-level space. Experiments on three challenging datasets (KTH, Weizmann and UT Interaction) validate the effectiveness of the approach in the recognition of human actions.
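
    For orientation, the traditional 2D LBP that the last abstract builds on can be sketched as below. This is only the classic 8-neighbour descriptor on a single grey-level image, not the 3D Gradient LBP proposed in the letter; the function names `lbp_codes` and `lbp_histogram`, the neighbour ordering, and the `>=` comparison are assumptions chosen for illustration.

```python
import numpy as np


def lbp_codes(image):
    """Classic 8-neighbour LBP code for every interior pixel of a 2D image.

    Illustrative sketch of traditional 2D LBP only; the 3D Gradient LBP
    described in the abstract is not reproduced here.
    """
    img = np.asarray(image, dtype=np.int64)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Neighbour offsets (dy, dx), clockwise from the top-left pixel.
    # The ordering is an assumption; any fixed ordering gives a valid LBP.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit where the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.int64) << bit
    return codes


def lbp_histogram(image):
    """Normalised 256-bin histogram of LBP codes, usable as a compact feature."""
    codes = lbp_codes(image)
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()


if __name__ == "__main__":
    patch = np.array([[9, 1, 9],
                      [1, 5, 1],
                      [9, 1, 9]])
    print(lbp_codes(patch))  # only the four corner bits are set
```

    The histogram of codes, rather than the codes themselves, is what is normally used as the feature vector, which is one reason LBP-style descriptors stay compact.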