firefox-translations-training
Investigate optimizing the CI training run
It would be nice to reduce the end-to-end time of the CI training run. In this profile it is 1 hour and 25 minutes: https://share.firefox.dev/3I192Z3
The training steps are the ones that take the longest time. We could try using less data, fewer epochs of training, and smaller model sizes.
| Step | Time |
|---|---|
| Teacher | 12:55 |
| Student | 12:43 |
| Finetune Student | 13:41 |
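To put these numbers in context, here is a rough sketch that sums the three training steps and compares them to the 1 hour 25 minute total. It assumes the times in the table are in `mm:ss` and that the three steps run one after another; the helper names (`to_seconds`, `fmt`) are only for illustration.

```python
# Assumption: table values are mm:ss and the steps run sequentially.
STEP_TIMES = {
    "Teacher": "12:55",
    "Student": "12:43",
    "Finetune Student": "13:41",
}
TOTAL_RUN_SECONDS = 85 * 60  # 1 hour 25 minutes, from the linked profile


def to_seconds(mmss: str) -> int:
    """Convert an 'mm:ss' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def fmt(seconds: int) -> str:
    """Format a number of seconds back into 'mm:ss'."""
    return f"{seconds // 60}:{seconds % 60:02d}"


training_seconds = sum(to_seconds(t) for t in STEP_TIMES.values())
training_share = training_seconds / TOTAL_RUN_SECONDS

print(f"Training steps: {fmt(training_seconds)} "
      f"({training_share:.0%} of the run)")
```

Under those assumptions the three training steps account for a bit under half of the run, so the remaining time is also worth looking at.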
@eu9ene you mentioned something about quantization failures in a meeting when we discussed this, can you elaborate on that?
It's basically this: https://github.com/mozilla/firefox-translations-training/issues/455
> you mentioned something about quantization failures in a meeting when we discussed this, can you elaborate on that?
I'm not sure, we should investigate
Something happens around the log line `[taskcluster-proxy] Successfully refreshed taskcluster-proxy credentials:`. I see two 20-minute gaps when this line appears: https://firefox-ci-tc.services.mozilla.com/tasks/TnmPTMeqSPWB727etTAAaw/runs/1/logs/live/public/logs/live.log
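One way to find gaps like these without reading the whole log by hand is to scan it for large jumps between consecutive timestamps. The sketch below is only an illustration: it assumes each log line carries an ISO-style timestamp such as `2024-01-01T10:00:00`, the `find_gaps` helper is hypothetical, and the sample lines are made up rather than copied from the task log.

```python
import re
from datetime import datetime, timedelta

# Assumption: log lines contain a timestamp like 2024-01-01T10:00:00.
TIMESTAMP_RE = re.compile(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})")


def find_gaps(lines, threshold=timedelta(minutes=10)):
    """Return (gap, previous_line, next_line) for every jump >= threshold."""
    gaps = []
    prev_time = None
    prev_line = None
    for line in lines:
        match = TIMESTAMP_RE.search(line)
        if not match:
            continue  # skip lines without a recognizable timestamp
        current = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
        if prev_time is not None and current - prev_time >= threshold:
            gaps.append((current - prev_time, prev_line, line))
        prev_time, prev_line = current, line
    return gaps


# Made-up sample lines, for illustration only.
sample = [
    "[task 2024-01-01T10:00:00.000Z] training step started",
    "[task 2024-01-01T10:00:05.000Z] loading data",
    "[taskcluster-proxy] Successfully refreshed taskcluster-proxy credentials:",
    "[task 2024-01-01T10:20:05.000Z] training step continues",
]

gaps = find_gaps(sample)
for gap, before, after in gaps:
    print(f"{gap} gap between:\n  {before}\n  {after}")
```

Running something like this over the downloaded `live.log` would show which lines sit on either side of each gap, which might help tell whether the credentials refresh is the cause or just the only thing logged while something else is stalled.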