IEICE TRANSACTIONS on Information

Speech Recognition for Air Traffic Control via Feature Learning and End-to-End Training

Peng FAN, Xiyao HUA, Yi LIN, Bo YANG, Jianwei ZHANG, Wenyi GE, Dongyue GUO

Summary:

In this work, we propose a new automatic speech recognition (ASR) system for air traffic control (ATC) based on feature learning and an end-to-end training procedure. The proposed model integrates a feature learning block, a recurrent neural network (RNN), and the connectionist temporal classification (CTC) loss into an end-to-end ASR model. To cope with the complex environments of ATC speech, a learning block is designed to extract informative features from raw waveforms for acoustic modeling, replacing handcrafted features. Both SincNet and 1D convolution blocks are applied to the raw waveforms, and their outputs are concatenated and fed to the RNN layers for temporal modeling. Because it learns representations directly from raw waveforms, the proposed model can be optimized in a fully end-to-end manner, i.e., from waveform to text. Finally, the multilingual nature of ATC speech is handled by constructing a combined vocabulary of Chinese characters and English letters. The proposed approach is validated on a multilingual real-world corpus (ATCSpeech), and the experimental results demonstrate that it outperforms the baselines, achieving a 6.9% character error rate.
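To make the combined vocabulary and the reported metric concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that the vocabulary is simply the set of distinct non-space characters in the transcripts, that index 0 is reserved for the CTC blank (a common convention), and that the character error rate is the Levenshtein distance divided by the reference length. The function names `build_vocab`, `edit_distance`, and `cer` are hypothetical.

```python
def build_vocab(transcripts):
    """Map each distinct character to an integer label.

    Chinese characters and English letters share one label space.
    Assumption: index 0 is reserved for the CTC blank symbol.
    """
    symbols = sorted({ch for text in transcripts for ch in text if not ch.isspace()})
    vocab = {"<blank>": 0}
    for index, ch in enumerate(symbols, start=1):
        vocab[ch] = index
    return vocab


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance over reference length, ignoring spaces."""
    ref_chars = [c for c in reference if not c.isspace()]
    hyp_chars = [c for c in hypothesis if not c.isspace()]
    if not ref_chars:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)
```

Under these assumptions, a mixed transcript such as `"国航 climb"` contributes two Chinese characters and five English letters to the same label set, and a single substituted character in a five-character reference yields a CER of 0.2.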

Publication
IEICE TRANSACTIONS on Information Vol.E106-D No.4 pp.538-544
Publication Date
2023/04/01
Publicized
2023/01/23
Online ISSN
1745-1361
DOI
10.1587/transinf.2022EDP7151
Type of Manuscript
PAPER
Category
Speech and Hearing

Authors

Peng FAN
  Sichuan University
Xiyao HUA
  Sichuan University
Yi LIN
  Sichuan University
Bo YANG
  Sichuan University
Jianwei ZHANG
  Sichuan University
Wenyi GE
  Chengdu University of Information Technology
Dongyue GUO
  Sichuan University

Keyword