firefox-translations-training issues

Logs disappear after a task fails

2

taskcluster

tc-p1

fix(snakefile): correctly locate files to translate

Similar to #402. In multiple places (e.g. https://github.com/mozilla/firefox-translations-training/blob/main/Snakefile#L549) there is an explicit file path with a `.gz` extension. While I use `.gz`, the `split` command supports other extensions as well....
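
A minimal sketch of the idea, not the actual change in this PR: instead of hard-coding a `.gz` suffix, a rule could look files up by their stem and accept whatever extension is present. The helper name `locate_by_stem` and the file naming used here are assumptions for illustration only.

```python
from pathlib import Path
from typing import List


def locate_by_stem(directory: str, stem: str) -> List[Path]:
    """Return files named '<stem>.<anything>' in directory, sorted by name.

    Hypothetical helper: it ignores the compression extension entirely,
    so '.gz', '.zst' or any other suffix is accepted.
    """
    prefix = stem + "."
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.name.startswith(prefix)
    )
```

For example, `locate_by_stem("parts", "file")` would pick up `file.00.zst` as readily as `file.00.gz`.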

AmitMY

snakemake

Do not run bicleaner step if the threshold is 0

It takes some time to spin up a multi-GPU machine and then to download the artifacts and install the dependencies. It just exits if the threshold is 0. We...

eu9ene

cost & perf

Full audit of resource utilization and adjustments

We should run the whole pipeline and analyze GCP dashboards. The first candidate for optimization is GPU machines for training. We don't shuffle the training dataset in memory anymore, so...

eu9ene

cost & perf

Incompatible python version

1

For bicleaner, you use https://github.com/mozilla/firefox-translations-training/blob/main/pipeline/bicleaner/download_pack.py#L93

```py
def main(args: Optional[list[str]] = None) -> None:
```

which is not allowed in this environment, defined [here](https://github.com/mozilla/firefox-translations-training/blob/main/envs/bicleaner.yml) to use Python 3.7.

> Traceback (most...
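
For context, subscripting the built-in `list` in an annotation (`list[str]`) is evaluated when the function is defined and raises `TypeError` on Python versions before 3.9. One possible workaround, sketched here under the assumption that the quoted signature is the only offending construct, is to use `typing.List`; adding `from __future__ import annotations` at the top of the module is another option on 3.7+. The function body below is a placeholder, not the real `download_pack.py` logic.

```python
from typing import List, Optional


def main(args: Optional[List[str]] = None) -> None:
    """Placeholder entry point with a Python 3.7-compatible signature."""
    # typing.List[str] is accepted on 3.7, unlike the built-in list[str] form.
    argv = [] if args is None else list(args)
    print(f"received {len(argv)} argument(s)")
```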

AmitMY

Getting a quick sample of data from the artifacts

2

When inspecting the running pipeline I have to download artifacts on the local machine quite often. I essentially do: `wget http://artifact.zst` `zstd -dc artifact.zst | head -n 100` or similar....
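
The `zstd -dc artifact.zst | head -n 100` step could be wrapped in a small helper along these lines. This is only a sketch: `sample_lines` is a hypothetical name, it works on an artifact that has already been downloaded, and it assumes the `zstd` command-line tool is on `PATH`.

```python
import subprocess
from typing import List, Sequence


def sample_lines(
    path: str,
    n: int = 100,
    decompress_cmd: Sequence[str] = ("zstd", "-dc"),
) -> List[str]:
    """Return the first n lines of a compressed file.

    The decompressor runs as a subprocess and is stopped as soon as
    n lines have been read, so large artifacts are not fully unpacked.
    """
    if n <= 0:
        return []
    proc = subprocess.Popen([*decompress_cmd, path], stdout=subprocess.PIPE)
    lines: List[str] = []
    try:
        for raw in proc.stdout:
            lines.append(raw.decode("utf-8", errors="replace").rstrip("\n"))
            if len(lines) >= n:
                break
    finally:
        # Close our end of the pipe and stop the decompressor early.
        proc.stdout.close()
        proc.terminate()
        proc.wait()
    return lines
```

Usage would look like `sample_lines("artifact.zst", 100)`; the `decompress_cmd` parameter is there so that another decompressor (for example `("gzip", "-dc")`) can be swapped in.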

eu9ene

enhancement

taskcluster

GPU workers still not always handling preemptions properly

4

We recently upgraded [worker-runner](https://github.com/taskcluster/taskcluster/tree/main/tools/worker-runner) on the GPU workers to a version that is supposed to gracefully handle spot preemptions. Most notably, it should be uploading artifacts before an instance terminates....

bhearsum

taskcluster

fix Dataset importer problems

4

fix #338

AmitMY

Tune workspace dashboard to enable comparison across models and experiments

3

We have a great bar chart to compare across the models for one experiment (runs group). We should figure out how to tune this dashboard or rename the steps so...

eu9ene

platform

Add Hugging Face data importer

2

Apparently, based on @gregtatum's investigation, we already use some monolingual data from there as a custom corpus. We also have a tool to list the available data: https://github.com/mozilla/firefox-translations-training/pull/397

eu9ene

data
