Orthogonal Gradient Penalty for Fast Training of Wasserstein GAN Based Multi-Task Autoencoder toward Robust Speech Recognition

Chao-Yuan KAO, Sangwook PARK, Alzahra BADI, David K. HAN, Hanseok KO

Summary:

Performance in Automatic Speech Recognition (ASR) degrades dramatically in noisy environments. To alleviate this problem, a variety of deep networks based on convolutional neural networks and recurrent neural networks, trained with L1 or L2 loss, have been proposed. In this Letter, we propose a new orthogonal gradient penalty (OGP) method for Wasserstein Generative Adversarial Networks (WGAN) applied to denoising and despeeching models. The WGAN is combined with a multi-task autoencoder that estimates not only speech features but also noise features from noisy speech. While achieving a 14.1% improvement in the Wasserstein distance convergence rate, features enhanced by the proposed OGP are tested in ASR and achieve 9.7%, 8.6%, 6.2%, and 4.8% WER improvements over the DDAE, MTAE, R-CED (CNN), and RNN models, respectively.
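The abstract names two ingredients that can be sketched in code: a multi-task autoencoder (a shared encoder with separate heads estimating speech and noise features from noisy speech) and a WGAN critic trained with a gradient penalty. The following minimal PyTorch sketch illustrates only those ingredients; all class names, layer sizes, and variables (MTAE, Critic, gradient_penalty, feat_dim, and so on) are hypothetical, and the penalty shown is the standard WGAN-GP term of Gulrajani et al., used here as a stand-in, not the Letter's orthogonal variant, whose exact formulation is given in the paper itself.

```python
import torch
import torch.nn as nn

class MTAE(nn.Module):
    """Multi-task autoencoder: a shared encoder with two decoder heads,
    one estimating speech features and one estimating noise features."""
    def __init__(self, feat_dim=40, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, hidden), nn.ReLU())
        self.speech_head = nn.Linear(hidden, feat_dim)
        self.noise_head = nn.Linear(hidden, feat_dim)

    def forward(self, noisy):
        z = self.encoder(noisy)
        return self.speech_head(z), self.noise_head(z)

class Critic(nn.Module):
    """WGAN critic scoring batches of feature frames."""
    def __init__(self, feat_dim=40, hidden=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, x):
        return self.net(x)

def gradient_penalty(critic, real, fake, lam=10.0):
    """Standard WGAN-GP term: penalize deviation of the critic's gradient
    norm from 1 at points interpolated between real and fake samples.
    The Letter's OGP modifies this penalty; see the paper for its form."""
    eps = torch.rand(real.size(0), 1, device=real.device)
    x_hat = (eps * real + (1.0 - eps) * fake).requires_grad_(True)
    grad = torch.autograd.grad(critic(x_hat).sum(), x_hat,
                               create_graph=True)[0]
    return lam * ((grad.norm(2, dim=1) - 1.0) ** 2).mean()
```

A simplified single training step under the same assumptions, pairing the Wasserstein adversarial loss with the two reconstruction losses of the multi-task autoencoder (the noise target noisy - clean assumes additive noise):

```python
mtae, critic = MTAE(), Critic()
opt_g = torch.optim.Adam(mtae.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(critic.parameters(), lr=1e-4)

noisy = torch.randn(32, 40)   # noisy-speech features (stand-in data)
clean = torch.randn(32, 40)   # paired clean-speech features

est_speech, est_noise = mtae(noisy)

# Critic step: Wasserstein loss plus gradient penalty.
d_loss = (critic(est_speech.detach()).mean() - critic(clean).mean()
          + gradient_penalty(critic, clean, est_speech.detach()))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: adversarial term plus multi-task reconstruction losses.
g_loss = (-critic(est_speech).mean()
          + nn.functional.mse_loss(est_speech, clean)
          + nn.functional.mse_loss(est_noise, noisy - clean))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```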

Publication
IEICE TRANSACTIONS on Information and Systems Vol.E103-D No.5 pp.1195-1198
Publication Date
2020/05/01
Publicized
2020/01/27
Online ISSN
1745-1361
DOI
10.1587/transinf.2019EDL8183
Type of Manuscript
LETTER
Category
Speech and Hearing

Authors

Chao-Yuan KAO
  Korea University
Sangwook PARK
  Johns Hopkins University
Alzahra BADI
  Korea University
David K. HAN
  US Army Research Laboratory (ARL)
Hanseok KO
  Korea University
