Yuka KO, Katsuhito SUDOH, Sakriani SAKTI, Satoshi NAKAMURA
End-to-end speech translation (ST) directly renders source language speech into the target language without the intermediate automatic speech recognition (ASR) output used in a cascade approach, and thus avoids error propagation from intermediate ASR results. Although recent attempts have applied multi-task learning with an auxiliary ASR task to improve ST performance, they use cross-entropy (CE) loss against one-hot references in the ASR task, so the trained ST models do not consider possible ASR confusion. In this study, we propose a novel multi-task learning framework for end-to-end ST that leverages an ASR-based loss against posterior distributions obtained from a pre-trained ASR model, called the ASR posterior-based loss (ASR-PBL). The ASR-PBL method, which enables an ST model to reflect possible ASR confusion among competing hypotheses with similar pronunciations, can be applied to a strong multi-task ST baseline model with a hybrid CTC/attention ASR task loss. In our experiments on the Fisher Spanish-to-English corpus, the proposed method achieved better BLEU results than the baseline trained with the standard CE loss.
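To illustrate the idea, the following is a minimal PyTorch sketch of an auxiliary ASR loss computed against soft posterior targets from a frozen pre-trained ASR model instead of one-hot references. The tensor names, shapes, and the task-weighting scheme are assumptions for illustration, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F


def asr_posterior_based_loss(aux_asr_logits: torch.Tensor,
                             teacher_asr_logits: torch.Tensor) -> torch.Tensor:
    """Cross-entropy of the ST model's auxiliary ASR branch against soft targets.

    Both tensors are assumed to have shape (batch, time, vocab). The teacher
    posteriors replace the usual one-hot references, so competing hypotheses
    with similar pronunciations keep nonzero probability mass.
    """
    log_probs = F.log_softmax(aux_asr_logits, dim=-1)          # student log-posteriors
    with torch.no_grad():
        soft_targets = F.softmax(teacher_asr_logits, dim=-1)   # frozen ASR posteriors
    # Soft-target cross-entropy, averaged over batch and time steps.
    return -(soft_targets * log_probs).sum(dim=-1).mean()


def multitask_loss(st_loss: torch.Tensor,
                   aux_asr_logits: torch.Tensor,
                   teacher_asr_logits: torch.Tensor,
                   asr_weight: float = 0.3) -> torch.Tensor:
    """Combine the main ST translation loss with the auxiliary ASR term.

    The weight `asr_weight` is a hypothetical hyperparameter for this sketch.
    """
    return st_loss + asr_weight * asr_posterior_based_loss(aux_asr_logits,
                                                           teacher_asr_logits)
```

In a standard multi-task setup, replacing the one-hot CE on the ASR branch with this soft-target term is the only change; the ST translation loss and the shared encoder are left as in the baseline.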