firefox-translations-training

Translation settings for backtranslation are suboptimal

Open XapaJIaMnu opened this issue 2 years ago • 3 comments

https://github.com/mozilla/firefox-translations-training/blob/03a2ddaa3f7d9c9af3a236bb2dbb94db36c16373/pipeline/translate/translate.sh#L22

When performing backtranslation, we want slightly different decoder settings: the decoder should sample from the output distribution instead of running beam search. The relevant Marian options are -b 1 --output-sampling (see https://github.com/marian-nmt/marian-dev/blob/master/src/common/config_parser.cpp#L711).

This setting is only relevant to backtranslation. Forward translation should be kept as is.
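A minimal sketch of how the two cases could be kept apart in a shell script. The helper name `decoder_args` and the mode names `backtranslation` and `forward` are hypothetical and do not come from the pipeline; only the flags `-b 1 --output-sampling` are taken from the proposal above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the pipeline): prints the extra
# marian-decoder flags to append for a given translation mode.
decoder_args() {
  case "$1" in
    backtranslation)
      # Sample from the output distribution instead of beam search.
      echo "-b 1 --output-sampling"
      ;;
    forward)
      # Forward translation keeps its existing settings: no extra flags.
      :
      ;;
    *)
      echo "unknown mode: $1" >&2
      return 1
      ;;
  esac
}
```

The printed string would then be appended to the existing marian-decoder invocation only when the step is a backtranslation, so the forward translation command line stays unchanged.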

XapaJIaMnu avatar May 01 '22 16:05 XapaJIaMnu

@XapaJIaMnu could you please clarify how this will help?

eu9ene avatar Jun 07 '22 23:06 eu9ene

See https://aclanthology.org/D18-1045/. tl;dr: backtranslations produced by sampling from the output distribution result in better overall translation quality than backtranslations produced by beam search.

XapaJIaMnu avatar Jun 08 '22 11:06 XapaJIaMnu