Open-Assistant icon indicating copy to clipboard operation
Open-Assistant copied to clipboard

Bth5032/78 blackcat trainer

Open bth5032 opened this issue 2 years ago • 3 comments

#78

As per discussion with @theblackcat102 I built the rankgen trainer on top of their framework (wandb). The model seems to be training now in fp32. Apparently t5 has some issue with fp16. Blackcat suggested scaling the weights might help as in 1. This could be a good next step if we want to move with this model. Until then, I think this code is worth comitting since it shows how to add new models which are not AutoModelForSequenceClassification

bth5032 avatar Jan 03 '23 06:01 bth5032

@bth5032 your wandb loss looks good, but I wonder why is the accuracy validation missing?

Not trying to nickpick anyway, but just curious why not assign the default value for tokenizer_name is same as model_name at the argument_parser then we can remove this if else at the trainer here

Overall the trainer looks fine

theblackcat102 avatar Jan 03 '23 12:01 theblackcat102

@bth5032 thank you! could you run pre-commit run --all-files to make linters happy?

yk avatar Jan 03 '23 14:01 yk

@bth5032 thank you! could you run pre-commit run --all-files to make linters happy?

Thanks!

Not trying to nickpick anyway, but just curious why not assign the default value for tokenizer_name is same as model_name at the argument_parser then we can remove this if else at the trainer here

Ah, I normally use hydra/omegaconf for this, didn't realize you had that utility function, I cleaned that logic up.

@bth5032 your wandb loss looks good, but I wonder why is the accuracy validation missing?

I've been trying to figure that out myself. The compute_metrics function never runs for me... I found this thread and tried setting evaluation_strategy=IntervalStrategy.STEPS and a few other things. But ultimately I verified the condition they listed here does evaluate to true in prediction_step. However I still don't ever see compute_metrics being called...

I'll keep looking into it and submit a PR if I figure it out.

@theblackcat102 @sanagno should be ready for merge.

bth5032 avatar Jan 04 '23 02:01 bth5032