firefox-translations-training

Training pipelines for Firefox Translations neural machine translation models

Results 311 firefox-translations-training issues

chrF is now considered more reliable than BLEU and should work better for CJK. Based on advice from #748. Also unifies sacrebleu and mtdata versions everywhere. Closes #748.

Does it require any adjustment? Do our metrics (chrF, COMET, BLEU) work correctly for these languages?
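To illustrate why a character-level metric may behave better than a word-level one for unsegmented CJK text, here is a minimal sketch of a chrF-like score. It is a simplified illustration, not sacrebleu's implementation: the function names `char_ngrams` and `simple_chrf` are hypothetical, whitespace is dropped, and n-gram orders with no n-grams on either side are skipped.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams, ignoring whitespace (a simplification)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF-like F-beta score over character n-grams (orders 1..max_n)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one side is too short to have n-grams of this order
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


def whitespace_word_overlap(hypothesis: str, reference: str) -> int:
    """Number of shared whitespace-delimited tokens (what a naive word metric sees)."""
    return len(set(hypothesis.split()) & set(reference.split()))


if __name__ == "__main__":
    hyp, ref = "我喜欢猫", "我喜欢狗"
    print("shared whitespace tokens:", whitespace_word_overlap(hyp, ref))
    print("simplified chrF:", simple_chrf(hyp, ref))
```

With unsegmented Chinese, each sentence is a single whitespace token, so the two near-identical sentences share no "words", while the character n-gram score still gives partial credit.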

language-coverage

Use custom OpusCleaner configs with disabled word-based filters. The filters are copied from https://github.com/hplt-project/HPLT-MT-Models/blob/main/v1.0/data/en-zh_hant/raw/v2/HPLT-v1.1.en-zh_hant.filters.json. I don't think it's feasible to do the src-trg-ratio that requires tokenization now. We would have...
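As a rough illustration of why word-based filters are a poor fit without tokenization, the sketch below computes a length ratio from whitespace-delimited words. It is not OpusCleaner's filter code; `whitespace_word_count` and `whitespace_word_ratio` are hypothetical helpers written for this example.

```python
def whitespace_word_count(text: str) -> int:
    """Count whitespace-delimited tokens, as a naive word-based filter would."""
    return len(text.split())


def whitespace_word_ratio(src: str, trg: str) -> float:
    """Ratio of the longer side to the shorter side, measured in whitespace tokens."""
    s, t = whitespace_word_count(src), whitespace_word_count(trg)
    return max(s, t) / max(min(s, t), 1)  # guard against division by zero


if __name__ == "__main__":
    src = "I like cats very much"
    trg = "我非常喜欢猫"
    # The Chinese side has no spaces, so it counts as a single "word".
    print(whitespace_word_count(trg))
    print(whitespace_word_ratio(src, trg))
```

Because an unsegmented Chinese sentence counts as one token regardless of its length, a word-count ratio would flag valid sentence pairs as mismatched unless the text is tokenized first.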

- character coverage
- size

Closes #745
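The snippet does not say what the two items refer to. Assuming they concern tokenizer vocabulary settings (SentencePiece, for example, exposes `character_coverage` and `vocab_size` training options), the sketch below shows one way to measure how much of a corpus the most frequent characters cover. The function `character_coverage` is a hypothetical helper written for this example, not code from the repository.

```python
from collections import Counter
from typing import Iterable


def character_coverage(lines: Iterable[str], num_chars: int) -> float:
    """Fraction of non-whitespace character occurrences covered by the
    `num_chars` most frequent characters in `lines`."""
    counts = Counter(ch for line in lines for ch in line if not ch.isspace())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(count for _, count in counts.most_common(num_chars))
    return covered / total


if __name__ == "__main__":
    corpus = ["我喜欢猫", "我喜欢狗", "猫和狗"]
    for k in (2, 4, 6):
        print(k, character_coverage(corpus, k))
```

CJK scripts have far larger character inventories than alphabetic scripts, so a coverage target and vocabulary size chosen for European languages may leave many rare characters out of the vocabulary.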

See comments from Jaume:
https://github.com/mozilla/firefox-translations-training/issues/45#issuecomment-1036191497
https://github.com/mozilla/firefox-translations-training/issues/45#issuecomment-1036198055

language-coverage

Does decoding, extract-best and other procedures for translation work the same way for CJK?

language-coverage

Does it require any modifications?

language-coverage

I don't have a good understanding of why some lines are suddenly empty as a result of running "extract_lex". There are just a few of them and the model trained...

@gregtatum The goal of this patch is to move much of the functionality from the [build-bergamot.py](https://searchfox.org/mozilla-central/rev/dca2603d55b5b39d3b8ab8e93c08b42563f5aad8/toolkit/components/translations/bergamot-translator/build-bergamot.py) script in Mozilla Central upstream into this repository to better streamline how WASM artifacts...

inference