firefox-translations-training
Training pipelines for Firefox Translations neural machine translation models
We're using sacrebleu version 2.0 and are missing the newer test sets (wmt21, wmt22, etc.). The requirements for the eval step pin 2.4. https://github.com/mjpost/sacrebleu/releases
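As a quick sanity check, a sketch like the following reports the installed sacrebleu version and whether the newer WMT test sets ship with it (`get_available_testsets` is part of sacrebleu's public API):

```python
# Sketch: report the installed sacrebleu version and check whether the
# newer WMT test sets are available in it.
import sacrebleu

print(sacrebleu.__version__)  # 2.0.x here vs. the 2.4 pinned for the eval step

available = sacrebleu.get_available_testsets()
for testset in ("wmt21", "wmt22"):
    print(testset, "present" if testset in available else "missing")
```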
dataset: `sacrebleu_aug-mix_mtedx/valid`. It worked fine for `flores_aug-mix_dev` and `mtdata_aug-mix_Neulab-tedtalks_dev-1-eng-ell`. https://firefox-ci-tc.services.mozilla.com/tasks/KmZrLAdcQtGnQd8ZdgxDhQ/runs/0

```
[task 2024-09-03T23:26:56.604Z] tokenizer.json: 0%| | 0.00/1.96M [00:00
```
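For context, the dataset identifiers above appear to follow an `<importer>_<augmentation>_<name>` pattern. The sketch below is only a hypothetical decomposition of that naming scheme, not the pipeline's actual parser:

```python
# Hypothetical sketch of how an identifier like "sacrebleu_aug-mix_mtedx/valid"
# might decompose; the field names and split logic are assumptions.
def parse_dataset_id(dataset_id: str) -> dict:
    importer, rest = dataset_id.split("_", 1)
    if rest.startswith("aug-"):
        augmentation, name = rest.split("_", 1)
    else:
        augmentation, name = None, rest
    return {"importer": importer, "augmentation": augmentation, "name": name}

print(parse_dataset_id("sacrebleu_aug-mix_mtedx/valid"))
# {'importer': 'sacrebleu', 'augmentation': 'aug-mix', 'name': 'mtedx/valid'}
print(parse_dataset_id("flores_aug-mix_dev"))
# {'importer': 'flores', 'augmentation': 'aug-mix', 'name': 'dev'}
```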
It takes 15 minutes to run one test, I think because of installing the virtual environment that contains the CUDA and PyTorch dependencies.

```
tests/test_data_importer.py::test_basic_corpus_import[mtdata-Neulab-tedtalks_test-1-eng-rus] PASSED [ 29%]
```

https://share.firefox.dev/3ZclPBD
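To confirm where the 15 minutes go, a rough sketch could time the environment install separately from the test run itself (the requirements path and pytest selector below are assumptions):

```python
# Rough sketch: time dependency installation and the test run separately to
# confirm that the virtual-environment setup dominates the runtime.
# The requirements file and test selector below are assumptions.
import subprocess
import time

def timed(cmd: list[str]) -> float:
    """Run a command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start

install_s = timed(["pip", "install", "-r", "requirements/tests.txt"])
test_s = timed(["pytest", "tests/test_data_importer.py", "-k", "mtdata"])
print(f"install: {install_s:.0f}s, test: {test_s:.0f}s")
```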
Is it possible the taskcluster fetches got corrupted? https://firefox-ci-tc.services.mozilla.com/tasks/FC2YNEIiS0mPBWnLTpDQEw/runs/0/logs/public/logs/live.log

```
[task 2024-09-04T10:32:25.194Z] Traceback (most recent call last):
[task 2024-09-04T10:32:25.194Z]   File "/home/ubuntu/.local/bin/opustrainer-train", line 8, in <module>
[task 2024-09-04T10:32:25.194Z]     sys.exit(main())
[task 2024-09-04T10:32:25.194Z]   File ...
```
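One way to test the corruption hypothesis would be to hash the fetched artifacts and compare against known-good digests. A minimal sketch (the artifact path is a placeholder):

```python
# Minimal sketch: hash a fetched artifact so it can be compared against a
# known-good digest. The path below is a placeholder, not an actual fetch.
import hashlib

def sha256sum(path: str) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

print(sha256sum("fetches/corpus.en.zst"))  # placeholder artifact path
```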
* [Profile of the CI tasks](https://share.firefox.dev/47gHESv)
* [evaluate-backward-flores-devtest-ru-en](https://share.firefox.dev/4gfdgvW)

| task | runtime |
| ---- | ------- |
| evaluate-backward-flores-devtest-ru-en | 16m |
| evaluate-teacher-ensemble-flores-devtest-ru-en | 18m |

nvidia cudnn...
For example:
```
gsutil ls gs://moz-fx-translations-data--303e-prod-translations-data/models/lt-en/opustrainer_no_student_aug_K1iHndFUSxSEDRLg_H9l1A/evaluation/student
```
shows
```
gs://moz-fx-translations-data--303e-prod-translations-data/models/lt-en/opustrainer_no_student_aug_K1iHndFUSxSEDRLg_H9l1A/evaluation/student/Neulab-tedtalks_test-1-eng-lit.en
gs://moz-fx-translations-data--303e-prod-translations-data/models/lt-en/opustrainer_no_student_aug_K1iHndFUSxSEDRLg_H9l1A/evaluation/student/Neulab-tedtalks_test-1-eng-lit.en.ref
gs://moz-fx-translations-data--303e-prod-translations-data/models/lt-en/opustrainer_no_student_aug_K1iHndFUSxSEDRLg_H9l1A/evaluation/student/Neulab-tedtalks_test-1-eng-lit.lt
gs://moz-fx-translations-data--303e-prod-translations-data/models/lt-en/opustrainer_no_student_aug_K1iHndFUSxSEDRLg_H9l1A/evaluation/student/Neulab-tedtalks_test-1-eng-lit.metrics
gs://moz-fx-translations-data--303e-prod-translations-data/models/lt-en/opustrainer_no_student_aug_K1iHndFUSxSEDRLg_H9l1A/evaluation/student/aug-mix_Neulab-tedtalks_test-1-eng-lit.en
gs://moz-fx-translations-data--303e-prod-translations-data/models/lt-en/opustrainer_no_student_aug_K1iHndFUSxSEDRLg_H9l1A/evaluation/student/aug-mix_Neulab-tedtalks_test-1-eng-lit.en.ref
gs://moz-fx-translations-data--303e-prod-translations-data/models/lt-en/opustrainer_no_student_aug_K1iHndFUSxSEDRLg_H9l1A/evaluation/student/aug-mix_Neulab-tedtalks_test-1-eng-lit.lt
gs://moz-fx-translations-data--303e-prod-translations-data/models/lt-en/opustrainer_no_student_aug_K1iHndFUSxSEDRLg_H9l1A/evaluation/student/aug-mix_Neulab-tedtalks_test-1-eng-lit.metrics
gs://moz-fx-translations-data--303e-prod-translations-data/models/lt-en/opustrainer_no_student_aug_K1iHndFUSxSEDRLg_H9l1A/evaluation/student/aug-mix_devtest.en
gs://moz-fx-translations-data--303e-prod-translations-data/models/lt-en/opustrainer_no_student_aug_K1iHndFUSxSEDRLg_H9l1A/evaluation/student/aug-mix_devtest.en.ref
gs://moz-fx-translations-data--303e-prod-translations-data/models/lt-en/opustrainer_no_student_aug_K1iHndFUSxSEDRLg_H9l1A/evaluation/student/aug-mix_devtest.lt
gs://moz-fx-translations-data--303e-prod-translations-data/models/lt-en/opustrainer_no_student_aug_K1iHndFUSxSEDRLg_H9l1A/evaluation/student/aug-mix_devtest.metrics
gs://moz-fx-translations-data--303e-prod-translations-data/models/lt-en/opustrainer_no_student_aug_K1iHndFUSxSEDRLg_H9l1A/evaluation/student/aug-mix_wmt19.en
gs://moz-fx-translations-data--303e-prod-translations-data/models/lt-en/opustrainer_no_student_aug_K1iHndFUSxSEDRLg_H9l1A/evaluation/student/aug-mix_wmt19.en.ref
gs://moz-fx-translations-data--303e-prod-translations-data/models/lt-en/opustrainer_no_student_aug_K1iHndFUSxSEDRLg_H9l1A/evaluation/student/aug-mix_wmt19.lt
gs://moz-fx-translations-data--303e-prod-translations-data/models/lt-en/opustrainer_no_student_aug_K1iHndFUSxSEDRLg_H9l1A/evaluation/student/aug-mix_wmt19.metrics
```
For example `aug-mix_wmt19.metrics` should...
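The same listing can be reproduced programmatically with the google-cloud-storage client, e.g. for scripted checks of which `.metrics` files exist (a sketch; assumes credentials are already configured):

```python
# Sketch: list the same evaluation artifacts with the google-cloud-storage
# client instead of gsutil (assumes application-default credentials).
from google.cloud import storage

BUCKET = "moz-fx-translations-data--303e-prod-translations-data"
PREFIX = ("models/lt-en/opustrainer_no_student_aug_K1iHndFUSxSEDRLg_H9l1A/"
          "evaluation/student/")

client = storage.Client()
for blob in client.list_blobs(BUCKET, prefix=PREFIX):
    if blob.name.endswith(".metrics"):
        print(f"gs://{BUCKET}/{blob.name}")
```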
I've been playing around with en-tr translations and I'd like to share some feedback. I chose [this story](https://learnenglish.britishcouncil.org/general-english/story-zone/a2-b1-stories/devils-details-a2/b1) for a detailed comparison with Google Translate. In the Google docs linked...
https://github.com/mozilla/firefox-translations-training/actions/runs/10688710629/job/29629195406

```
Error: Ensure GITHUB_TOKEN has permission "id-token: write".
```

I'm not sure what happened to the token.
It would be interesting to compare evaluation capabilities of LLMs to COMET and human evaluation. See the paper: [Large Language Models Are State-of-the-Art Evaluators of Translation Quality](https://arxiv.org/pdf/2302.14520)
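For reference, scoring the same outputs with COMET via the unbabel-comet package would look roughly like this (a sketch; `Unbabel/wmt22-comet-da` is one of the public checkpoints):

```python
# Sketch: score a translation with a reference-based COMET checkpoint,
# as a baseline to compare against LLM-based evaluation.
from comet import download_model, load_from_checkpoint

model = load_from_checkpoint(download_model("Unbabel/wmt22-comet-da"))
data = [{
    "src": "Şeytan ayrıntıda gizlidir.",
    "mt": "The devil is in the detail.",
    "ref": "The devil is in the details.",
}]
print(model.predict(data, batch_size=8, gpus=0).system_score)
```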
For example https://huggingface.co/datasets/ontocord/CulturaY.
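A quick way to inspect such a dataset is streaming a few records with the Hugging Face datasets library (a sketch; the `"tr"` config name is an assumption, check the dataset card for the actual per-language configs):

```python
# Sketch: stream a few records from CulturaY without downloading the full
# dataset. The "tr" config name is an assumption; see the dataset card.
from datasets import load_dataset

ds = load_dataset("ontocord/CulturaY", "tr", split="train", streaming=True)
for _, record in zip(range(3), ds):
    print(record)
```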