firefox-translations-training
Translation settings for backtranslation are suboptimal
https://github.com/mozilla/firefox-translations-training/blob/03a2ddaa3f7d9c9af3a236bb2dbb94db36c16373/pipeline/translate/translate.sh#L22
When performing backtranslation, we want slightly different decoder settings: we should sample from the output distribution instead of running beam search. The relevant Marian setting:
https://github.com/marian-nmt/marian-dev/blob/master/src/common/config_parser.cpp#L711
-b 1 --output-sampling
This setting applies only to backtranslation. Forward translation should be kept as is.
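A rough sketch of how the split could look in a wrapper script. This is not the current translate.sh: the function name `extra_decoder_args` and the mode names `backward` and `forward` are made up for illustration, and only the `-b 1 --output-sampling` flags come from the suggestion above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pick extra marian-decoder flags by translation direction.
# The mode names "backward" and "forward" are illustrative, not pipeline options.
extra_decoder_args() {
  local mode="$1"
  case "${mode}" in
    backward)
      # Backtranslation: sample from the output distribution, beam size 1.
      echo "-b 1 --output-sampling"
      ;;
    forward)
      # Forward translation: add nothing, keep the existing beam search settings.
      echo ""
      ;;
    *)
      echo "unknown mode: ${mode}" >&2
      return 1
      ;;
  esac
}

# Possible usage (file names are placeholders):
#   marian-decoder -c decoder.yml $(extra_decoder_args backward) -i mono.txt -o mono.bt.txt
```

Leaving the expansion unquoted in the usage line is deliberate so the flags split into separate arguments; for forward translation it expands to nothing and the decoder call is unchanged.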
@XapaJIaMnu could you please clarify how this will help?
https://aclanthology.org/D18-1045/ tl;dr: Training on backtranslations produced by sampling from the output distribution gives better overall translation quality than training on backtranslations produced by beam search.