ImageReward Strange training dynamics for ImageReward model.

Open bhattg opened this issue 2 years ago • 3 comments

Hi! I am trying to train a reward model, and I am confused about why neither the gradients nor the loss change during the initial training iterations. Only after some steps do they suddenly start changing, and then training completes.

The learning dynamics are shown in the attached screenshot. [Image: Screen Shot 2023-10-08 at 6 21 46 PM]

Oct 09 '23 01:10 bhattg

Hello, which versions of Python and CUDA are you using? Thank you.

Oct 17 '23 02:10 learn01one

This is a very interesting discovery, and I believe it may be related to the learning rate schedule and warm-up settings, although there could be other factors worth exploring.

Nov 05 '23 07:11 xujz18

Hello, sorry I couldn't get back sooner on the version question: I am using Python 3.10.13 and CUDA 11.7.

The experiment was run using torch 1.13.0.

Regarding the learning dynamics, I am using the following settings:

--fix_rate 0.7 --lr 1e-05 --lr-decay-style cosine --warmup 0.0 --batch_size 32 --accumulation_steps 1 --epochs 50
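To make the schedule discussion concrete, here is a minimal sketch of a linear warm-up followed by cosine decay. It is not taken from the ImageReward code: the function `lr_at_step`, the reading of `--warmup` as a fraction of total steps, and the `total_steps` value are all assumptions for illustration only.

```python
import math


def lr_at_step(step, total_steps, base_lr=1e-5, warmup=0.0, min_lr=0.0):
    """Learning rate at `step`: linear warm-up, then cosine decay.

    Hypothetical helper. `warmup` is ASSUMED to be the fraction of
    `total_steps` spent ramping up; the real scheduler may differ.
    """
    warmup_steps = int(warmup * total_steps)
    if step < warmup_steps:
        # Linear ramp that reaches base_lr on the last warm-up step.
        return base_lr * (step + 1) / warmup_steps
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    # Cosine decay from base_lr down to min_lr.
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total_steps = 1000  # hypothetical; depends on dataset size, batch size, epochs
    for step in (0, 250, 500, 750, 1000):
        print(step, lr_at_step(step, total_steps, base_lr=1e-5, warmup=0.0))
```

Under these assumptions, `--warmup 0.0` means the very first step already uses the full `1e-05`, so a slow warm-up ramp alone would not produce an initial plateau in this sketch. Printing the actual learning rate and gradient norm at each step of the real run would show whether the repository's scheduler behaves the same way.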

Nov 06 '23 18:11 bhattg
