Since a concept can be represented by different vocabularies, styles, and levels of detail, translation resembles a many-to-many mapping from a distribution of sentences in the source language to a distribution of sentences in the target language. This viewpoint, however, is not fully realized in current neural machine translation (NMT), which performs one-to-one sentence mapping. In this study, we represent the distribution itself as multiple paraphrased sentences, which enriches the model's understanding of context and enables it to produce diverse hypotheses. We use visually grounded paraphrasing (VGP), which uses images to constrain the concept being paraphrased, to guarantee that the generated paraphrases remain within the intended distribution. In this way, our method can also be seen as incorporating image information into NMT without using the images themselves. We implement this idea by crowdsourcing a paraphrasing corpus that realizes VGP and by constructing neural paraphrasers that act as expert models in an NMT system. Our experimental results show that the proposed VGP augmentation strategies improve over a vanilla NMT baseline.
Johanes EFFENDI
Nara Institute of Science and Technology / RIKEN Center for Advanced Intelligence Project (AIP)
Sakriani SAKTI
Nara Institute of Science and Technology / RIKEN Center for Advanced Intelligence Project (AIP)
Katsuhito SUDOH
Nara Institute of Science and Technology / RIKEN Center for Advanced Intelligence Project (AIP)
Satoshi NAKAMURA
Nara Institute of Science and Technology / RIKEN Center for Advanced Intelligence Project (AIP)
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Johanes EFFENDI, Sakriani SAKTI, Katsuhito SUDOH, Satoshi NAKAMURA, "Leveraging Neural Caption Translation with Visually Grounded Paraphrase Augmentation" in IEICE TRANSACTIONS on Information,
vol. E103-D, no. 3, pp. 674-683, March 2020, doi: 10.1587/transinf.2019EDP7065.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019EDP7065/_p
@ARTICLE{e103-d_3_674,
  author={Johanes EFFENDI and Sakriani SAKTI and Katsuhito SUDOH and Satoshi NAKAMURA},
  journal={IEICE TRANSACTIONS on Information},
  title={Leveraging Neural Caption Translation with Visually Grounded Paraphrase Augmentation},
  year={2020},
  volume={E103-D},
  number={3},
  pages={674--683},
  abstract={Since a concept can be represented by different vocabularies, styles, and levels of detail, translation resembles a many-to-many mapping from a distribution of sentences in the source language to a distribution of sentences in the target language. This viewpoint, however, is not fully realized in current neural machine translation (NMT), which performs one-to-one sentence mapping. In this study, we represent the distribution itself as multiple paraphrased sentences, which enriches the model's understanding of context and enables it to produce diverse hypotheses. We use visually grounded paraphrasing (VGP), which uses images to constrain the concept being paraphrased, to guarantee that the generated paraphrases remain within the intended distribution. In this way, our method can also be seen as incorporating image information into NMT without using the images themselves. We implement this idea by crowdsourcing a paraphrasing corpus that realizes VGP and by constructing neural paraphrasers that act as expert models in an NMT system. Our experimental results show that the proposed VGP augmentation strategies improve over a vanilla NMT baseline.},
  doi={10.1587/transinf.2019EDP7065},
  ISSN={1745-1361},
  month={March},
}
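If the BibTeX record above needs to be consumed programmatically (for example, to build a reference list), its fields can be extracted with a small stdlib-only sketch. `parse_bibtex_fields` is a hypothetical helper, not part of any library, and it assumes simple `key={value}` fields without nested braces, which holds for this record:

```python
import re

def parse_bibtex_fields(entry: str) -> dict:
    """Extract simple key={value} fields from a single BibTeX entry.

    Assumes field values contain no nested braces.
    """
    return {key.lower(): value.strip()
            for key, value in re.findall(r"(\w+)\s*=\s*\{([^{}]*)\}", entry)}

record = """@ARTICLE{e103-d_3_674,
  author={Johanes EFFENDI and Sakriani SAKTI and Katsuhito SUDOH and Satoshi NAKAMURA},
  journal={IEICE TRANSACTIONS on Information},
  title={Leveraging Neural Caption Translation with Visually Grounded Paraphrase Augmentation},
  year={2020},
  volume={E103-D},
  number={3},
  pages={674--683},
  doi={10.1587/transinf.2019EDP7065},
}"""

fields = parse_bibtex_fields(record)
print(fields["doi"])     # 10.1587/transinf.2019EDP7065
print(fields["volume"])  # E103-D
```

For entries with nested braces or `@string` macros, a full parser such as the `bibtexparser` package would be a safer choice than this regex sketch.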
TY - JOUR
TI - Leveraging Neural Caption Translation with Visually Grounded Paraphrase Augmentation
T2 - IEICE TRANSACTIONS on Information
SP - 674
EP - 683
AU - Johanes EFFENDI
AU - Sakriani SAKTI
AU - Katsuhito SUDOH
AU - Satoshi NAKAMURA
PY - 2020
DO - 10.1587/transinf.2019EDP7065
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E103-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - 2020/03//
AB - Since a concept can be represented by different vocabularies, styles, and levels of detail, translation resembles a many-to-many mapping from a distribution of sentences in the source language to a distribution of sentences in the target language. This viewpoint, however, is not fully realized in current neural machine translation (NMT), which performs one-to-one sentence mapping. In this study, we represent the distribution itself as multiple paraphrased sentences, which enriches the model's understanding of context and enables it to produce diverse hypotheses. We use visually grounded paraphrasing (VGP), which uses images to constrain the concept being paraphrased, to guarantee that the generated paraphrases remain within the intended distribution. In this way, our method can also be seen as incorporating image information into NMT without using the images themselves. We implement this idea by crowdsourcing a paraphrasing corpus that realizes VGP and by constructing neural paraphrasers that act as expert models in an NMT system. Our experimental results show that the proposed VGP augmentation strategies improve over a vanilla NMT baseline.
ER -