Consider using backward-forward translation for knowledge distillation

Open eu9ene opened this issue 7 months ago • 0 comments

Backward-forward translation can help reduce the teacher-student quality gap for language pairs where we have little monolingual data in the source language.

See: From Research to Production and Back: Ludicrously Fast Neural Machine Translation

We could use monolingual NLLB data in the target language. I looked at it and it's lower quality, so I'm hesitant to use it for back-translations that augment teacher training. That's not a problem if we only back-translate this data into the source language and then use the resulting source text for forward translation as part of knowledge distillation.
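
A minimal sketch of the data flow, to make the proposal concrete. This is not the pipeline's actual code: `backward_forward_distill`, `backward_model`, and `teacher_model` are hypothetical names, the two toy translators only stand in for real models, and dropping the original target sentences is my reading of why the lower-quality data is acceptable here.

```python
from typing import Callable, List, Tuple

# A translator maps a batch of sentences to a batch of translations.
# Hypothetical interface, not the API of any real toolkit.
Translator = Callable[[List[str]], List[str]]


def backward_forward_distill(
    target_mono: List[str],
    backward_model: Translator,  # target -> source
    teacher_model: Translator,   # source -> target
) -> List[Tuple[str, str]]:
    """Build student training pairs from target-side monolingual data."""
    # Step 1: back-translate target monolingual text into the source language.
    synthetic_source = backward_model(target_mono)
    # Step 2: forward-translate the synthetic source with the teacher.
    teacher_output = teacher_model(synthetic_source)
    # The original target sentences are not part of the result: the student
    # sees only (synthetic source, teacher output) pairs.
    return [
        (src, tgt)
        for src, tgt in zip(synthetic_source, teacher_output)
        if src.strip() and tgt.strip()  # skip empty translations
    ]


def toy_backward(batch: List[str]) -> List[str]:
    # Stand-in for a real target -> source model: reverses word order.
    return [" ".join(reversed(s.split())) for s in batch]


def toy_teacher(batch: List[str]) -> List[str]:
    # Stand-in for a real source -> target teacher: uppercases the text.
    return [s.upper() for s in batch]


pairs = backward_forward_distill(["hallo welt", ""], toy_backward, toy_teacher)
```

In this sketch the student never trains on the original target-side sentences, only on the teacher's output for the synthetic source text.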

eu9ene, Jul 09 '24 22:07