The search functionality is under construction.

Author Search Result

[Author] Chengyu YU(1hit)

1-1hit
  • Enhancing Speech Quality in Air Traffic Control Communication Using DIUnet_V-Based Speech Enhancement Techniques Open Access

    Haijun LIANG  Yukun LI  Jianguo KONG  Qicong HAN  Chengyu YU  

     
    PAPER-Speech and Hearing

      Pubricized:
    2023/12/11
      Vol:
    E107-D No:4
      Page(s):
    551-558

    Air Traffic Control (ATC) communication suffers from issues such as high electromagnetic interference, fast speech rate, and low intelligibility, which pose challenges for downstream tasks like Automatic Speech Recognition (ASR). This article aims to research how to enhance the audio quality and intelligibility of civil aviation speech through speech enhancement methods, thereby improving the accuracy of speech recognition and providing support for the digitalization of civil aviation. We propose a speech enhancement model called DIUnet_V (DenseNet & Inception & U-Net & Volume) that combines both time-frequency and time-domain methods to effectively handle the specific characteristics of civil aviation speech, such as predominant electromagnetic interference and fast speech rate. For model evaluation, we assess the denoising and enhancement effects using three metrics: Signal-to-Noise Ratio (SNR), Mean Opinion Score (MOS), and speech recognition error rate. On a simulated ATC training recording dataset, DIUnet_Volume10 achieved an SNR value of 7.3861, showing a 4.5663 improvement compared to the original U-net model. To address the challenge of the absence of clean speech in the ATC working environment, which makes it difficult to accurately calculate SNR, we propose evaluating the denoising effects indirectly based on the recognition performance of an ATC speech recognition system. On a real ATC speech dataset, the average word error rate decreased by 1.79% absolute and the average sentence error rate decreased by 3% absolute for DIUnet_V processed speech compared to the unprocessed speech in the built speech recognition system.