firefox-translations-training
Training pipelines for Firefox Translations neural machine translation models
# Experiment insights

## OpusCleaner

- legacy cleaning slightly outperforms all OpusCleaner configs (likely due to num_mismatch filter in OpusCleaner)
- large FastText model significantly reduces false positives compared to...
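To make the `num_mismatch` remark concrete, here is a minimal sketch of what a number-mismatch filter does in general. It is not OpusCleaner's implementation: the regular expression, `numbers_match`, and `filter_pairs` are hypothetical names chosen for illustration, and the strict string comparison is an assumption.

```python
import re
from collections import Counter

# Hypothetical pattern: runs of digits, optionally joined by "." or ",".
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")


def numbers_match(source: str, target: str) -> bool:
    """Return True when both sides contain the same multiset of number strings."""
    return Counter(NUMBER_RE.findall(source)) == Counter(NUMBER_RE.findall(target))


def filter_pairs(pairs):
    """Keep only the sentence pairs whose numbers agree on both sides."""
    return [(src, trg) for src, trg in pairs if numbers_match(src, trg)]


if __name__ == "__main__":
    corpus = [
        ("I have 2 cats", "Ich habe 2 Katzen"),  # kept: same number
        ("Room 12", "Zimmer 21"),                # dropped: numbers differ
        ("3.5 kg", "3,5 kg"),                    # dropped: locale formatting differs
    ]
    for src, trg in filter_pairs(corpus):
        print(f"{src}\t{trg}")
```

In this sketch a strict comparison also rejects pairs whose numbers differ only in locale formatting, so how aggressively such a filter is configured can change how much valid data survives cleaning.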
It would be nice to have:

* File path `--file`
* URL `--url`
* Remote settings collection `--remote-settings en-ca --version 1.0a1`

For remote settings we would need to use the...
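The three options above could be wired up roughly as follows. This is a sketch with `argparse`, not the project's actual command-line interface: `build_parser` and `parse_args` are hypothetical helpers, and treating the three sources as mutually exclusive (with `--version` valid only alongside `--remote-settings`) is an assumption the original text does not state.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Hypothetical input-source options")
    # Assumption: exactly one source may be given per invocation.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--file", metavar="PATH", help="local file path")
    source.add_argument("--url", metavar="URL", help="URL to download from")
    source.add_argument(
        "--remote-settings",
        metavar="COLLECTION",
        help="Remote Settings collection, e.g. en-ca",
    )
    parser.add_argument(
        "--version",
        metavar="VERSION",
        help="version to request, e.g. 1.0a1 (only with --remote-settings)",
    )
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: --version is meaningless without a Remote Settings collection.
    if args.version and not args.remote_settings:
        parser.error("--version can only be used together with --remote-settings")
    return args


if __name__ == "__main__":
    print(parse_args(["--remote-settings", "en-ca", "--version", "1.0a1"]))
```

Passing `argv` explicitly keeps the parser easy to exercise without touching `sys.argv`.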
- Make sure that the integrated [OpusFilter](https://helsinki-nlp.github.io/OpusFilter/index.html) works
- Produce configs with OpusFilter
- Compare results to regular OpusCleaner based configs
```[tasklist] ### Bugs - [ ] https://github.com/mozilla/firefox-translations-training/issues/688 - [ ] https://github.com/mozilla/firefox-translations-training/issues/716 - [ ] https://github.com/mozilla/firefox-translations-training/issues/846 - [ ] https://github.com/mozilla/firefox-translations-training/issues/862 - [ ] Check that COMET is published from the recently...
We are especially interested in publishing the full training live.log as a file to W&B artifacts or logs (wherever it will be more convenient to view it). This can be...
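One way to attach a log file to a W&B run as an artifact is sketched below. It assumes the `wandb` Python package is installed and that a run is already active; `publish_log`, the artifact name `live-log`, and the artifact type `logs` are hypothetical choices, not names used by the project.

```python
from pathlib import Path


def publish_log(run, log_path, artifact_name="live-log"):
    """Upload a single log file to W&B as an artifact of the given run.

    `run` is expected to be the object returned by `wandb.init(...)`.
    """
    import wandb  # imported lazily so this module loads without wandb installed

    path = Path(log_path)
    if not path.is_file():
        raise FileNotFoundError(path)

    # The artifact name and type are illustrative assumptions.
    artifact = wandb.Artifact(name=artifact_name, type="logs")
    artifact.add_file(str(path))
    run.log_artifact(artifact)
    return artifact


# Usage sketch (the project name is a placeholder):
#   run = wandb.init(project="my-project")
#   publish_log(run, "live.log")
#   run.finish()
```

Uploading the file once at the end of training keeps the sketch simple; streaming the log while training is still running would need a different approach.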
This includes publishing:

- live training logs to W&B dashboards

I assume we'll have separate publishing scripts for other things. Let's use [Taskgraph transforms](https://taskcluster-taskgraph.readthedocs.io/en/latest/concepts/transforms.html) not to pollute Taskcluster kinds with...
Some weird things I noticed in https://wandb.ai/moz-translations/lt-en:

- teacher-ensemble evals is empty
- group logs doesn't have any metrics
- group logs is missing for some groups
- quantized is...