firefox-translations-training Support training separate source/target SentencePiece Models

Open • radinplaid opened this issue 2 years ago • 1 comment

It appears that the pipeline only supports training a joint BPE model, but separate source and target BPE vocabularies are sometimes the better choice.

Jul 15 '22 14:07 radinplaid

I would really like to see that too. I work on language pairs with no overlap between the source and target character sets, so a separate tokenization model for each side makes sense.

Jul 27 '22 14:07 AmitMY
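
For anyone weighing the same trade-off, here is a rough Python sketch of the idea discussed above. It is not the pipeline's implementation. `char_overlap` is a hypothetical helper that measures how much the source and target character sets overlap, and `train_separate_models` assumes the third-party `sentencepiece` package is installed; the file paths, the `vocab.src` / `vocab.tgt` prefixes, and the vocabulary size are placeholders chosen for illustration.

```python
from typing import Iterable, Set


def char_set(lines: Iterable[str]) -> Set[str]:
    """Collect the distinct non-whitespace characters in a corpus."""
    return {ch for line in lines for ch in line if not ch.isspace()}


def char_overlap(src_lines: Iterable[str], tgt_lines: Iterable[str]) -> float:
    """Jaccard overlap of the two character sets.

    0.0 means the scripts are disjoint, 1.0 means they are identical.
    """
    src, tgt = char_set(src_lines), char_set(tgt_lines)
    union = src | tgt
    return len(src & tgt) / len(union) if union else 0.0


def train_separate_models(src_path: str, tgt_path: str, vocab_size: int = 32000) -> None:
    """Train one SentencePiece BPE model per side.

    Assumption: the `sentencepiece` package is available. The output
    prefixes and the default vocab size are hypothetical, not pipeline settings.
    """
    import sentencepiece as spm  # imported lazily so the overlap check runs without it

    for side, path in (("src", src_path), ("tgt", tgt_path)):
        spm.SentencePieceTrainer.train(
            input=path,
            model_prefix=f"vocab.{side}",  # writes vocab.src.model / vocab.tgt.model
            vocab_size=vocab_size,
            model_type="bpe",
        )
```

With an English and Hebrew pair, `char_overlap(["hello world"], ["שלום עולם"])` returns `0.0`, which is the situation described in the comment above: a joint vocabulary would simply be split between two scripts that never share a subword.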
