Author Search Result

[Author] Qingyun WANG (7 hits)

  • An Integrated Convolutional Neural Network with a Fusion Attention Mechanism for Acoustic Scene Classification

    Pengxu JIANG  Yue XIE  Cairong ZOU  Li ZHAO  Qingyun WANG  

     
    LETTER-Engineering Acoustics

      Publicized:
    2023/02/06
      Vol:
    E106-A No:8
      Page(s):
    1057-1061

    In human-computer interaction, acoustic scene classification (ASC) is one of the relevant research domains. In real life, the recorded audio may include a lot of noise and quiet clips, making it hard for earlier ASC-based research to isolate the crucial scene information in sound. Furthermore, scene information may be scattered across numerous audio frames; hence, selecting scene-related frames is crucial for ASC. In this context, an integrated convolutional neural network with a fusion attention mechanism (ICNN-FA) is proposed for ASC. Firstly, segmented mel-spectrograms as the input of ICNN can assist the model in learning the short-term time-frequency correlation information. Then, the designed ICNN model is employed to learn these segment-level features. In addition, the proposed global attention layer may gather global information by integrating these segment features. Finally, the developed fusion attention layer is utilized to fuse all segment-level features while the classifier classifies various situations. Experimental findings using ASC datasets from DCASE 2018 and 2019 indicate the efficacy of the suggested method.
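The idea of fusing segment-level features with attention, as described in the abstract above, can be illustrated with a minimal sketch. This is plain softmax attention pooling, not the paper's fusion attention layer; the `query` vector, the function names, and the toy feature values are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(segment_features, query):
    """Fuse segment-level feature vectors into one clip-level vector.

    segment_features: list of equal-length feature vectors, one per segment.
    query: hypothetical learned vector used to score each segment (assumption).
    """
    # Score each segment by its dot product with the query vector.
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in segment_features]
    weights = softmax(scores)
    dim = len(segment_features[0])
    # Weighted sum of the segment features, one output value per dimension.
    fused = [
        sum(w * feat[d] for w, feat in zip(weights, segment_features))
        for d in range(dim)
    ]
    return fused, weights


# Toy example: the second segment scores highest and dominates the fusion.
segments = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
fused, weights = attention_fuse(segments, query=[1.0, 1.0])
```

Segments with low scores (for example quiet or noisy clips) receive small weights, which is the intuition behind selecting scene-related frames.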

  • Dual-Path Convolutional Neural Network Based on Band Interaction Block for Acoustic Scene Classification (Open Access)

    Pengxu JIANG  Yang YANG  Yue XIE  Cairong ZOU  Qingyun WANG  

     
    LETTER-Engineering Acoustics

      Publicized:
    2023/10/04
      Vol:
    E107-A No:7
      Page(s):
    1040-1044

    Convolutional neural networks (CNNs) are widely used in acoustic scene classification (ASC) tasks. In most cases, local convolution is utilized to gather time-frequency information between spectrum nodes. It is challenging to adequately express the non-local link between frequency domains in a finite convolution region. In this paper, we propose a dual-path convolutional neural network based on a band interaction block (DCNN-bi) for ASC, with the mel-spectrogram as the model’s input. We build two parallel CNN paths to learn the high-frequency and low-frequency components of the input feature. Additionally, we have created three band interaction blocks (bi-blocks), connected between the two paths, to explore the pertinent nodes between various frequency bands. Combining the time-frequency information from the two paths, the bi-blocks with three distinct designs acquire non-local information and send it back to the respective paths. The experimental results indicate that the bi-block has the potential to improve the initial performance of the CNN substantially. Specifically, when applied to the DCASE 2018 and DCASE 2020 datasets, the CNN exhibited performance improvements of 1.79% and 3.06%, respectively.
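As a rough illustration of feeding low-frequency and high-frequency components to two parallel paths, the sketch below splits a mel-spectrogram along the mel-bin axis. The abstract does not say how the two components are obtained, so the midpoint split, the `split_bands` helper, and the toy data are assumptions.

```python
def split_bands(mel_spectrogram, split_bin=None):
    """Split a mel-spectrogram into low-band and high-band parts.

    mel_spectrogram: list of mel-bin rows (lowest bin first), each a list of frames.
    split_bin: index of the first high-band bin; defaults to the midpoint (assumption).
    """
    n_bins = len(mel_spectrogram)
    if split_bin is None:
        split_bin = n_bins // 2
    if not 0 < split_bin < n_bins:
        raise ValueError("split_bin must leave at least one bin in each band")
    low = mel_spectrogram[:split_bin]
    high = mel_spectrogram[split_bin:]
    return low, high


# Toy spectrogram: 4 mel bins x 3 frames, value = bin * 10 + frame.
mel = [[float(b * 10 + t) for t in range(3)] for b in range(4)]
low, high = split_bands(mel)
```

Each half would then go to its own CNN path; the band interaction blocks that exchange information between the paths are not modeled here.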

  • An Effective Acoustic Feedback Cancellation Algorithm Based on the Normalized Sub-Band Adaptive Filter

    Xia WANG  Ruiyu LIANG  Qingyun WANG  Li ZHAO  Cairong ZOU  

     
    LETTER-Speech and Hearing

      Publicized:
    2015/10/20
      Vol:
    E99-D No:1
      Page(s):
    288-291

    In this letter, an effective acoustic feedback cancellation algorithm is proposed based on the normalized sub-band adaptive filter (NSAF). To resolve the conflict between a fast convergence rate and low misalignment in the NSAF algorithm, a variable step size is designed to vary automatically according to the update state of the filter. The update state of the filter is adaptively detected via the normalized distance between the long-term average and the short-term average of the tap-weight vector. Simulation results demonstrate that the proposed algorithm has superior performance in terms of convergence rate and misalignment.
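One way to picture a step size driven by the distance between long-term and short-term averages of the tap-weight vector is sketched below. The exponential averaging, the decay constants, the normalization, and the linear mapping to a step size are all assumptions; the abstract does not give the letter's actual formulas.

```python
import math


class StepSizeController:
    """Hypothetical variable step-size rule based on averaged tap weights."""

    def __init__(self, n_taps, mu_min=0.01, mu_max=1.0,
                 long_decay=0.99, short_decay=0.8, eps=1e-12):
        self.long_avg = [0.0] * n_taps    # slowly varying average
        self.short_avg = [0.0] * n_taps   # quickly varying average
        self.mu_min, self.mu_max = mu_min, mu_max
        self.long_decay, self.short_decay = long_decay, short_decay
        self.eps = eps

    def update(self, weights):
        """Update both averages with the current tap weights; return a step size."""
        self.long_avg = [self.long_decay * a + (1 - self.long_decay) * w
                         for a, w in zip(self.long_avg, weights)]
        self.short_avg = [self.short_decay * a + (1 - self.short_decay) * w
                          for a, w in zip(self.short_avg, weights)]
        diff = math.sqrt(sum((s - l) ** 2
                             for s, l in zip(self.short_avg, self.long_avg)))
        scale = (math.sqrt(sum(s * s for s in self.short_avg))
                 + math.sqrt(sum(l * l for l in self.long_avg)) + self.eps)
        distance = min(1.0, diff / scale)  # normalized to [0, 1]
        # Large distance: filter still adapting, use a large step.
        # Small distance: filter has settled, use a small step.
        return self.mu_min + (self.mu_max - self.mu_min) * distance


controller = StepSizeController(n_taps=2)
steady_weights = [1.0, -0.5]
first_mu = controller.update(steady_weights)
for _ in range(2000):
    last_mu = controller.update(steady_weights)
```

With constant weights the two averages converge to the same vector, so the step size decays from near `mu_max` toward `mu_min`, which mirrors the trade-off between convergence rate and misalignment described above.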

  • Compressed Sampling and Source Localization of Miniature Microphone Array

    Qingyun WANG  Xinchun JI  Ruiyu LIANG  Li ZHAO  

     
    LETTER

      Vol:
    E97-A No:9
      Page(s):
    1902-1906

    In traditional microphone array signal processing, performance degrades rapidly as the array aperture decreases, which has been a barrier to its use in small-scale acoustic systems such as digital hearing aids. In this work, a new compressed sampling method for miniature microphone arrays is proposed, which compresses information inside the ADC by means of a mixed system of hardware circuitry and software in order to remove the redundancy among the signals of the different array elements. The architecture of the method is developed in the Verilog language and has already been tested on an FPGA chip. Experiments on compressed sampling and reconstruction show successful sparse representation and reconstruction of speech sources. Because it avoids the singularity problem of the correlation matrix of the miniature microphone array, the proposed method, when used for direction of arrival (DOA) estimation in digital hearing aids, offers higher resolution than the traditional GCC and MUSIC algorithms.

  • Sub-Band Noise Reduction in Multi-Channel Digital Hearing Aid

    Qingyun WANG  Ruiyu LIANG  Li JING  Cairong ZOU  Li ZHAO  

     
    LETTER-Speech and Hearing

      Publicized:
    2015/10/14
      Vol:
    E99-D No:1
      Page(s):
    292-295

    Since digital hearing aids are sensitive to time delay and power consumption, the computational complexity of noise reduction must be reduced as much as possible. Therefore, some complicated algorithms based on time-frequency analysis are very difficult to implement in digital hearing aids. This paper presents a new approach that yields an improved noise reduction algorithm with greatly reduced computational complexity for multi-channel digital hearing aids. First, the sub-band sound pressure level (SPL) is calculated in real time. Then, based on the calculated sub-band SPL, the noise in the sub-band is estimated and the possibility of speech is computed. Finally, a posteriori and a priori signal-to-noise ratios are estimated and the gain function is acquired to reduce the noise adaptively. By replacing the FFT and IFFT transforms with the known SPL, the proposed algorithm greatly reduces the computational load. Experiments on a prototype digital hearing aid show that the time delay is decreased to nearly half that of the traditional adaptive Wiener filtering and spectral subtraction algorithms, while the SNR improvement and PESQ score remain satisfactory. Compared with the modulation frequency-based noise reduction algorithm, which is used in many commercial digital hearing aids, the proposed algorithm achieves not only more than 5 dB of SNR improvement but also less time delay and power consumption.
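The steps in the abstract above (sub-band SPL, then a posteriori and a priori SNR, then a gain) can be sketched as follows. The SPL formula with a 20 µPa reference is standard; the decision-directed a priori SNR estimate and the Wiener-style gain are common textbook choices swapped in for illustration, not the paper's stated gain function. The function names and the smoothing constant `alpha` are assumptions.

```python
import math

REFERENCE_PRESSURE_PA = 20e-6  # standard SPL reference pressure in air


def sound_pressure_level(samples_pa):
    """SPL in dB for a block of sub-band samples given in pascals."""
    mean_square = sum(x * x for x in samples_pa) / len(samples_pa)
    rms = math.sqrt(mean_square)
    return 20.0 * math.log10(max(rms, 1e-12) / REFERENCE_PRESSURE_PA)


def wiener_gain(signal_power, noise_power, prev_gain, prev_posteriori, alpha=0.98):
    """Return (gain, a posteriori SNR) for one sub-band frame.

    Uses a decision-directed a priori SNR estimate (assumption, not from the source).
    """
    posteriori = signal_power / max(noise_power, 1e-12)
    priori = (alpha * prev_gain ** 2 * prev_posteriori
              + (1.0 - alpha) * max(posteriori - 1.0, 0.0))
    gain = priori / (1.0 + priori)  # Wiener-style gain in [0, 1)
    return gain, posteriori


# A 1 Pa RMS signal corresponds to roughly 94 dB SPL.
spl = sound_pressure_level([1.0, -1.0, 1.0, -1.0])

# High-SNR frame: gain stays close to 1 (speech is preserved).
speech_gain, _ = wiener_gain(10.0, 1.0, prev_gain=1.0, prev_posteriori=10.0)
# Noise-only frame: gain stays close to 0 (noise is suppressed).
noise_gain, _ = wiener_gain(1.0, 1.0, prev_gain=0.1, prev_posteriori=1.0)
```

Working from sub-band power levels rather than FFT bins is what lets this kind of scheme avoid the FFT/IFFT pair mentioned in the abstract.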

  • A Salient Feature Extraction Algorithm for Speech Emotion Recognition

    Ruiyu LIANG  Huawei TAO  Guichen TANG  Qingyun WANG  Li ZHAO  

     
    LETTER-Speech and Hearing

      Publicized:
    2015/05/29
      Vol:
    E98-D No:9
      Page(s):
    1715-1718

    A salient feature extraction algorithm is proposed to improve the recognition rate of speech emotion. Firstly, the spectrogram of the emotional speech is calculated. Secondly, imitating the selective attention mechanism, the color, direction and brightness maps of the spectrogram are computed. Each map is normalized and down-sampled to form a low-resolution feature matrix. Then, each feature matrix is converted to a row vector and principal component analysis (PCA) is used to reduce feature redundancy, making the subsequent classification algorithm more practical. Finally, the speech emotion is classified with a support vector machine. Compared with traditional features, the improvement in recognition rate reaches 15%.
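The normalize, down-sample, and flatten steps from the abstract above can be sketched in a few lines. Min-max normalization and block averaging are assumptions (the abstract does not specify either), the helper names are hypothetical, and the PCA and SVM stages are left out.

```python
def normalize(feature_map):
    """Min-max normalize a 2-D map to the range [0, 1]."""
    flat = [v for row in feature_map for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in feature_map]
    return [[(v - lo) / span for v in row] for row in feature_map]


def downsample(feature_map, factor):
    """Down-sample by averaging non-overlapping factor x factor blocks."""
    rows = len(feature_map) // factor
    cols = len(feature_map[0]) // factor
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            block = [feature_map[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def to_row_vector(feature_map):
    """Flatten a 2-D feature matrix into a single row vector."""
    return [v for row in feature_map for v in row]


# Toy 4x4 "brightness" map reduced to a 4-element feature vector.
brightness = [[0, 2, 4, 6], [2, 4, 6, 8], [4, 6, 8, 10], [6, 8, 10, 12]]
vector = to_row_vector(downsample(normalize(brightness), factor=2))
```

In a full pipeline the row vectors from the color, direction, and brightness maps would be passed to PCA and then to a classifier.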

  • An Iterative Technique for Optimally Designing Extrapolated Impulse Response Filter in the Mini-Max Sense

    Hao WANG  Li ZHAO  Wenjiang PEI  Jiakuo ZUO  Qingyun WANG  Minghai XIN  

     
    LETTER-Systems and Control

      Vol:
    E96-A No:10
      Page(s):
    2029-2033

    The optimal design of an extrapolated impulse response (EIR) filter (in the mini-max sense) is a non-linear programming problem. In this paper, the optimal design of the EIR filter using semi-infinite programming (SIP) is investigated, and an iterative technique for optimally designing the EIR filter is proposed. Simulation experiments validate the effectiveness of the SIP technique and the proposed iterative technique in the optimal design of the EIR filter.