IEICE global.ieice.org Site

Keyword Search Result

[Keyword] mel-spectrogram(2hit)

1-2hit

Dual-Path Convolutional Neural Network Based on Band Interaction Block for Acoustic Scene Classification (Open Access)
Pengxu JIANG Yang YANG Yue XIE Cairong ZOU Qingyun WANG

LETTER-Engineering Acoustics

Publicized: 2023/10/04
Vol: E107-A No:7
Page(s): 1040-1044
Convolutional neural networks (CNNs) are widely used in acoustic scene classification (ASC) tasks. In most cases, local convolution is utilized to gather time-frequency information between spectrum nodes. It is challenging to adequately express the non-local link between frequency domains in a finite convolution region. In this paper, we propose a dual-path convolutional neural network based on band interaction block (DCNN-bi) for ASC, with the mel-spectrogram as the model's input. We build two parallel CNN paths to learn the high-frequency and low-frequency components of the input feature. Additionally, we have created three band interaction blocks (bi-blocks), connected between the two paths, to explore the pertinent nodes between various frequency bands. Combining the time-frequency information from the two paths, the bi-blocks, each with a distinct design, acquire non-local information and send it back to the respective paths. The experimental results indicate that the bi-block has the potential to substantially improve the initial performance of the CNN. Specifically, when applied to the DCASE 2018 and DCASE 2020 datasets, the CNN exhibited performance improvements of 1.79% and 3.06%, respectively.
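The dual-path idea in this abstract can be illustrated with a minimal sketch. It is not the authors' DCNN-bi: the abstract does not specify how the bands are split or how a bi-block is built, so the function names (`split_bands`, `frame_means`, `toy_band_interaction`), the split index, and the mixing weight are all assumptions made for illustration. The sketch only shows a mel-spectrogram being divided into a low-frequency and a high-frequency path, with each path receiving a summary of the other.

```python
from typing import List, Tuple

# Rows are mel bands (ordered low to high), columns are time frames.
Spectrogram = List[List[float]]


def split_bands(mel: Spectrogram, split_index: int) -> Tuple[Spectrogram, Spectrogram]:
    """Split a mel-spectrogram into low- and high-frequency band groups."""
    if not 0 < split_index < len(mel):
        raise ValueError("split_index must leave at least one band on each side")
    return mel[:split_index], mel[split_index:]


def frame_means(bands: Spectrogram) -> List[float]:
    """Average over the frequency axis, giving one value per time frame."""
    n_bands = len(bands)
    return [sum(column) / n_bands for column in zip(*bands)]


def toy_band_interaction(
    low: Spectrogram, high: Spectrogram, weight: float = 0.5
) -> Tuple[Spectrogram, Spectrogram]:
    """Hypothetical stand-in for a bi-block (assumption, not the paper's design).

    Each path is updated with a per-frame summary of the other path, so
    information crosses between frequency bands that a small local
    convolution kernel would not connect.
    """
    low_summary = frame_means(low)
    high_summary = frame_means(high)
    new_low = [[v + weight * h for v, h in zip(row, high_summary)] for row in low]
    new_high = [[v + weight * s for v, s in zip(row, low_summary)] for row in high]
    return new_low, new_high


if __name__ == "__main__":
    mel = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]  # 4 bands, 2 frames
    low_path, high_path = split_bands(mel, split_index=2)
    print(toy_band_interaction(low_path, high_path))
```

In a real model each path would be a stack of convolutional layers and the interaction would be learned; the fixed averaging here is only a placeholder for that exchange.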
An Integrated Convolutional Neural Network with a Fusion Attention Mechanism for Acoustic Scene Classification
Pengxu JIANG Yue XIE Cairong ZOU Li ZHAO Qingyun WANG

LETTER-Engineering Acoustics

Publicized: 2023/02/06
Vol: E106-A No:8
Page(s): 1057-1061
In human-computer interaction, acoustic scene classification (ASC) is one of the relevant research domains. In real life, the recorded audio may include a lot of noise and quiet clips, making it hard for earlier ASC-based research to isolate the crucial scene information in sound. Furthermore, scene information may be scattered across numerous audio frames; hence, selecting scene-related frames is crucial for ASC. In this context, an integrated convolutional neural network with a fusion attention mechanism (ICNN-FA) is proposed for ASC. Firstly, segmented mel-spectrograms as the input of ICNN can assist the model in learning the short-term time-frequency correlation information. Then, the designed ICNN model is employed to learn these segment-level features. In addition, the proposed global attention layer may gather global information by integrating these segment features. Finally, the developed fusion attention layer is utilized to fuse all segment-level features while the classifier classifies various situations. Experimental findings using ASC datasets from DCASE 2018 and 2019 indicate the efficacy of the suggested method.
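The segment-then-fuse pipeline described in this abstract can be sketched as follows. This is not the authors' ICNN-FA: the abstract gives no segment length, hop size, or attention formulation, so `segment_frames`, `softmax`, `attention_fuse`, and the dot-product scoring against a `query` vector are assumptions chosen for illustration. The sketch cuts a mel-spectrogram into fixed-length segments along time and then combines per-segment feature vectors with softmax attention weights.

```python
import math
from typing import List

# Rows are mel bands, columns are time frames.
Spectrogram = List[List[float]]


def segment_frames(mel: Spectrogram, segment_len: int, hop: int) -> List[Spectrogram]:
    """Cut a mel-spectrogram into fixed-length segments along the time axis."""
    n_frames = len(mel[0])
    segments = []
    for start in range(0, n_frames - segment_len + 1, hop):
        segments.append([row[start:start + segment_len] for row in mel])
    return segments


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features: List[List[float]], query: List[float]) -> List[float]:
    """Fuse segment-level feature vectors with dot-product attention.

    `query` is a hypothetical scoring vector (learned in a real model);
    segments whose features align with it receive larger weights.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * feat[d] for w, feat in zip(weights, features)) for d in range(dim)]


if __name__ == "__main__":
    mel = [[0.0, 1.0, 2.0, 3.0, 4.0, 5.0], [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]]
    print(len(segment_frames(mel, segment_len=4, hop=2)))
    # With an all-zero query every segment gets the same weight (plain averaging).
    print(attention_fuse([[1.0, 3.0], [3.0, 5.0]], query=[0.0, 0.0]))
```

In practice the feature vectors would come from a CNN applied to each segment, and the attention weights would let noisy or silent segments contribute less to the fused representation.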
