
Inference Discrepancy Based Curriculum Learning for Neural Machine Translation

Lei ZHOU, Ryohei SASANO, Koichi TAKEDA


Summary:

In practice, even a well-trained neural machine translation (NMT) model can still make biased inferences on the training set due to distribution shifts. In human learning, if we cannot reproduce something correctly after studying it multiple times, we consider it to be more difficult. Likewise, a training example that causes a large discrepancy between inference and reference implies a higher learning difficulty for the MT model. We therefore propose to adopt the inference discrepancy of each training example as the difficulty criterion and to rank training examples from easy to hard accordingly. In this way, a trained model can guide the curriculum learning process of an initial model identical to itself. We describe this training scheme as analogous to a pretrained vanilla model guiding the learning process of a curriculum NMT model. In this paper, we assess the effectiveness of the proposed training scheme and examine the influence of translation direction, evaluation metrics, and different curriculum schedules. Experimental results on the translation benchmarks WMT14 English ⇒ German, WMT17 Chinese ⇒ English, and the Multitarget TED Talks Task (MTTT) English ⇔ German, English ⇔ Chinese, and English ⇔ Russian demonstrate that our proposed method consistently improves translation performance over a strong Transformer baseline.
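As a rough illustration of the idea described above, the following Python sketch (not taken from the paper) scores each training pair by the discrepancy between a pretrained model's inference and the reference, here approximated with sentence-level BLEU via sacrebleu, and then feeds batches from an easy-to-hard staged schedule. The `model.translate` method, the choice of BLEU as the discrepancy metric, and the staged schedule are all assumptions made for illustration; the paper's exact metric and scheduling may differ.

# Hypothetical sketch of inference-discrepancy-based curriculum ordering.
# Assumes a pretrained "vanilla" NMT model exposing a translate(src) method.
from sacrebleu.metrics import BLEU

bleu = BLEU(effective_order=True)  # effective_order is needed for sentence-level BLEU

def inference_discrepancy(model, src, ref):
    """Higher discrepancy (lower BLEU against the reference) = harder example."""
    hyp = model.translate(src)  # inference by the pretrained model
    return 100.0 - bleu.sentence_score(hyp, [ref]).score

def order_easy_to_hard(model, pairs):
    """Rank (src, ref) pairs from easy to hard by inference discrepancy."""
    scored = [(inference_discrepancy(model, s, r), s, r) for s, r in pairs]
    scored.sort(key=lambda t: t[0])  # smallest discrepancy first = easiest first
    return [(s, r) for _, s, r in scored]

def curriculum_batches(ordered_pairs, num_stages, batch_size):
    """A simple staged schedule: gradually expose harder data as training proceeds."""
    stage_len = max(1, len(ordered_pairs) // num_stages)
    for stage in range(1, num_stages + 1):
        visible = ordered_pairs[: stage * stage_len]  # the visible subset grows each stage
        for i in range(0, len(visible), batch_size):
            yield visible[i : i + batch_size]

In this sketch, the curriculum model would be trained on the batches yielded by curriculum_batches, so the pretrained vanilla model only supplies the difficulty ranking and is not updated.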

Publication
IEICE TRANSACTIONS on Information and Systems, Vol.E107-D, No.1, pp.135-143
Publication Date
2024/01/01
Publicized
2023/10/18
Online ISSN
1745-1361
DOI
10.1587/transinf.2023EDP7048
Type of Manuscript
PAPER
Category
Natural Language Processing

Authors

Lei ZHOU
  Nagoya University
Ryohei SASANO
  Nagoya University
Koichi TAKEDA
  Nagoya University

Keyword