IEICE TRANSACTIONS on Fundamentals

An Integrated Convolutional Neural Network with a Fusion Attention Mechanism for Acoustic Scene Classification

Pengxu JIANG, Yue XIE, Cairong ZOU, Li ZHAO, Qingyun WANG

Summary:

Acoustic scene classification (ASC) is an active research area within human-computer interaction. Real-world recordings often contain considerable noise and silent clips, which makes it hard for earlier ASC methods to isolate the crucial scene information in the audio. Furthermore, scene information may be scattered across many audio frames, so selecting scene-related frames is crucial for ASC. To address this, an integrated convolutional neural network with a fusion attention mechanism (ICNN-FA) is proposed for ASC. First, segmented mel-spectrograms are used as the input to the ICNN, which helps the model learn short-term time-frequency correlations. The designed ICNN model then learns these segment-level features. In addition, the proposed global attention layer gathers global information by integrating the segment features. Finally, the developed fusion attention layer fuses all segment-level features, and a classifier assigns the scene label. Experimental results on the ASC datasets from DCASE 2018 and 2019 indicate the effectiveness of the proposed method.
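
The segment-then-fuse idea in the summary can be illustrated with a minimal sketch. This is not the authors' ICNN-FA architecture: the summary gives no layer details, so the function names (`split_into_segments`, `segment_feature`, `attention_fuse`), the segment length, the hop size, and the query vector are all hypothetical. A per-bin mean stands in for the CNN feature extractor, and a plain dot-product softmax stands in for the fusion attention layer.

```python
import math

def split_into_segments(spectrogram, segment_len, hop):
    """Split a (frames x mel_bins) spectrogram into overlapping time segments."""
    n_frames = len(spectrogram)
    return [spectrogram[start:start + segment_len]
            for start in range(0, n_frames - segment_len + 1, hop)]

def segment_feature(segment):
    """Stand-in for a CNN: average each mel bin over the segment's frames."""
    n_bins = len(segment[0])
    return [sum(frame[b] for frame in segment) / len(segment) for b in range(n_bins)]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(features, query):
    """Weight each segment feature by softmax(feature . query) and sum them."""
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in features]
    weights = softmax(scores)
    fused = [sum(w * feat[d] for w, feat in zip(weights, features))
             for d in range(len(query))]
    return fused, weights

# Toy "mel-spectrogram": 10 frames x 3 mel bins (made-up values).
spectrogram = [[float(t), 0.5 * t, 1.0] for t in range(10)]

segments = split_into_segments(spectrogram, segment_len=4, hop=2)
features = [segment_feature(seg) for seg in segments]

# Hypothetical query vector; in a trained model this would be learned.
query = [0.1, 0.0, 0.0]
fused, weights = attention_fuse(features, query)

print(len(segments), "segments")
print("attention weights:", [round(w, 3) for w in weights])
print("fused feature:", [round(v, 3) for v in fused])
```

The attention weights sum to one, so segments that score higher against the query contribute more to the fused vector. That is the general sense in which an attention layer can emphasize scene-related segments over noisy or silent ones.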

Publication
IEICE TRANSACTIONS on Fundamentals Vol.E106-A No.8 pp.1057-1061
Publication Date
2023/08/01
Publicized
2023/02/06
Online ISSN
1745-1337
DOI
10.1587/transfun.2022EAL2091
Type of Manuscript
LETTER
Category
Engineering Acoustics

Authors

Pengxu JIANG
  Southeast University
Yue XIE
  Southeast University
Cairong ZOU
  Southeast University
Li ZHAO
  Southeast University
Qingyun WANG
  Nanjing Institute of Technology

Keyword