AlignProp

Noise after a certain number of epochs

Open sachinnitw1317 opened this issue 1 year ago • 5 comments

Hi,

I did some experiments to reproduce your results, but the model seems to lose all context after a certain number of epochs.

I am attaching the report here: https://wandb.ai/sachin931350/align-prop/runs/ngkluhfs/overview

Please let me know what I am doing wrong.

sachinnitw1317 • Oct 12 '23 03:10

I looked at your config. I normally use:

total_samples_per_epoch=256
total_batch_size=128

I think you are using much lower values for these. Can you try again with the values above?

mihirp1998 • Oct 12 '23 04:10

If you have made any other changes to the config, let me know.

mihirp1998 • Oct 12 '23 04:10

The other config values are the same. I reduced these two to run on a T4 machine.

Let me try with total_samples_per_epoch=256 and total_batch_size=128.

sachinnitw1317 • Oct 12 '23 04:10

Reducing the batch size might work, but I think you should also try reducing the learning rate along with it.

I think this issue might be caused by a high learning rate.

mihirp1998 • Oct 12 '23 05:10
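One common way to pick a lower learning rate for a smaller batch is the linear scaling heuristic. The sketch below is only an illustration of that heuristic, not something taken from the AlignProp code: the function name `scale_learning_rate` and the example values are hypothetical, and the thread does not say which learning rate the original config uses.

```python
def scale_learning_rate(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Scale a learning rate linearly with the batch size (a heuristic, not a guarantee)."""
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    # Linear scaling: a batch that is k times smaller gets a learning rate k times smaller.
    return base_lr * new_batch_size / base_batch_size


# Hypothetical values: a learning rate tuned for total_batch_size=128,
# rescaled for a run that only fits an effective batch of 32.
reduced_lr = scale_learning_rate(base_lr=1e-3, base_batch_size=128, new_batch_size=32)
print(reduced_lr)
```

The result is only a starting point. If the samples still collapse to noise, lowering the rate further is a reasonable next experiment.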

I have started another run with a batch size of 128, as you suggested. All the other settings are the same except the capacity per GPU.

I will know the result in a couple of hours.

sachinnitw1317 • Oct 12 '23 05:10
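Keeping total_batch_size=128 on a single small GPU usually means relying on gradient accumulation. The sketch below shows the arithmetic under a common assumption: the effective batch equals the per-GPU batch times the number of GPUs times the accumulation steps. The helper `accumulation_steps` and the per-GPU value of 4 are hypothetical, and the exact way AlignProp derives these numbers from its config may differ.

```python
def accumulation_steps(total_batch_size: int, per_gpu_batch: int, num_gpus: int = 1) -> int:
    """Return the number of gradient accumulation steps needed to reach total_batch_size."""
    samples_per_step = per_gpu_batch * num_gpus
    if samples_per_step <= 0:
        raise ValueError("per_gpu_batch and num_gpus must be positive")
    if total_batch_size % samples_per_step != 0:
        raise ValueError("total_batch_size must be a multiple of per_gpu_batch * num_gpus")
    return total_batch_size // samples_per_step


# Values from the thread: total_samples_per_epoch=256, total_batch_size=128.
# per_gpu_batch=4 is a hypothetical capacity for a single T4.
steps = accumulation_steps(total_batch_size=128, per_gpu_batch=4, num_gpus=1)
updates_per_epoch = 256 // 128  # optimizer updates per epoch under the same assumption
print(steps, updates_per_epoch)
```

With these numbers, each optimizer update averages gradients over 32 small forward and backward passes, so the run takes longer but sees the same effective batch as the suggested config.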