firefox-translations-training issues

Support extra metrics from Tensorboard

I noticed that I have some extra metrics generated by marian tensorboard that are missing on W&B dashboards. They are all useful. The missing ones: - valid/bleu-detok_stalled - valid/ce-mean-words_stalled -...

eu9ene

platform

Display corpus size in W&B

3

We should display things we look at often in W&B. Final merged corpus size after deduplication is something I look at periodically to understand how aggressive the cleaning is overall....

eu9ene

platform

[meta] Train RTL languages like Arabic and Hebrew

RTL languages shouldn't affect training, but supporting them will require some work on the Firefox side. This meta bug tracks any work that is needed. We should complete a subset...

gregtatum

epic

language-coverage

[meta] Train easy to segment LTR languages

2

In the short term we are focusing on building up our language list by training easy to segment LTR languages, as they don't require changes to the training pipeline, and...

gregtatum

epic

language-coverage

COMET results are not visible on custom charts

Each metric should use its own scale.

eu9ene

platform

Consider using backward-forward translation for knowledge distillation

It can help reduce the teacher-student quality gap where we have little monolingual data in the source language. See: [From Research to Production and Back: Ludicrously Fast Neural Machine Translation](https://aclanthology.org/D19-5632.pdf)...

eu9ene

quality

Exclude start stage tasks from existing tasks

2

This fixes the edge case where we have alignments-original -> alignments-backtranslated both marked as `stage: alignments-teacher` and want to restart both of them. We should probably split them later to...

eu9ene

Configs stage2

2

Latest config updates. Replace en-uk config with the one used to train the models (it lacks extra mono data).

eu9ene

Duplicate runs in W&B

1

https://wandb.ai/moz-translations/tr-en/workspace?nw=nwuserepavlov https://firefox-ci-tc.services.mozilla.com/tasks/groups/SDD81N6sRu61LOL4xZJc-Q

eu9ene

bug

platform

start_stage often reruns almost all "evaluate" tasks

4

I ran this one to do the export task, but since "evaluate" tasks are not sequential, they are rerun each time I use start_stage, which wastes GPU resources....

eu9ene

taskcluster

cost & perf
