firefox-translations-training
Teacher does not continue training after pretraining on augmented corpus
I continue testing the pipeline and I see that almost all teacher models don't continue training, even after I increased patience by setting `early-stopping: 20`.

Currently, continuation happens by training new models on a parallel corpus using the `--pretrained-model` flag and the `model.npz.best-chrf.npz` checkpoint of the teacher that was pre-trained on an augmented corpus for 2 epochs.
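For reference, a minimal sketch of what this continuation step amounts to as a Marian invocation (directory and corpus names here are hypothetical, not the pipeline's actual paths):

```bash
# Sketch of the current continuation step, using Marian's standard flags.
# A *new* training run is started on the parallel corpus; --pretrained-model
# only initializes weights from the augmented-corpus teacher, so the optimizer
# state and training progress of the pre-training run are discarded.
marian \
  --model teacher-finetuned/model.npz \
  --pretrained-model teacher-base/model.npz.best-chrf.npz \
  --train-sets corpus.src.gz corpus.trg.gz \
  --vocabs vocab.spm vocab.spm \
  --valid-sets dev.src.gz dev.trg.gz \
  --valid-metrics chrf \
  --early-stopping 20
```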
Also, I see that the quality of every continuation model is somewhat worse than that of the pre-trained model, and it is the continuation model that we use for translation further down the pipeline.
I went with this approach after running into constant workflow issues with continuing training in the same folder. It seems this is not correct. Maybe we should copy `model.npz.optimizer.npz`, or the entire model directory, instead of using the `--pretrained-model` flag? @kpu @XapaJIaMnu
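As a sketch of that alternative (file names follow Marian's usual checkpoint layout; whether every one of them is present for a given run is an assumption), continuing in place would mean copying the full checkpoint, optimizer state included, and restarting Marian with the same `--model` path:

```bash
# Hypothetical sketch: copy the whole checkpoint so Marian resumes training
# instead of re-initializing. The .optimizer.npz file holds the optimizer
# state that --pretrained-model drops; .progress.yml holds training progress.
cp teacher-base/model.npz               teacher-finetuned/model.npz
cp teacher-base/model.npz.optimizer.npz teacher-finetuned/model.npz.optimizer.npz
cp teacher-base/model.npz.progress.yml  teacher-finetuned/model.npz.progress.yml
cp teacher-base/model.npz.yml           teacher-finetuned/model.npz.yml

# Restarting with the same --model path makes Marian pick up the copied
# checkpoint and continue from the saved state.
marian --model teacher-finetuned/model.npz \
  --train-sets corpus.src.gz corpus.trg.gz \
  --vocabs vocab.spm vocab.spm
```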
Increasing the early-stopping threshold can help, but it still does not properly fine-tune for some languages, presumably because of low data quality:
```yaml
training-teacher-base:
  # remove for low-resource languages or if training without augmentation
  after: 2e
  early-stopping: 20
training-teacher-finetuned:
  early-stopping: 40
```
@eu9ene Is this bug actionable? Should we close it if there are no more specific things to focus on?
It's essentially the same as https://github.com/mozilla/firefox-translations-training/issues/472. We now train everything in one run with OpusTrainer, and using the worse fine-tuned model is no longer a problem, since the pre-trained checkpoint will be used if training doesn't continue.