ibot
Larger effective batch size?
For ViT-S training, the batch size per GPU is 64 and the world size is 16, giving an effective batch size of 1024. Does training become unstable if the effective batch size is increased beyond this, for example to 1280?
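To make the arithmetic concrete, here is a minimal sketch of how the effective batch size follows from the per-GPU batch size and the world size. It also shows how a learning rate would change under a linear scaling rule. The linear rule, the reference batch size of 256, and the function names are assumptions for illustration, not something confirmed from the iBOT code.

```python
def effective_batch_size(batch_size_per_gpu: int, world_size: int) -> int:
    """Total number of samples per optimizer step across all processes."""
    return batch_size_per_gpu * world_size


def linearly_scaled_lr(base_lr: float, effective_bs: int, reference_bs: int = 256) -> float:
    """Hypothetical linear scaling rule: lr grows in proportion to batch size.

    reference_bs=256 is an assumed reference point, not a value taken
    from the iBOT configuration.
    """
    return base_lr * effective_bs / reference_bs


# Setup described in the question: 64 per GPU on 16 GPUs.
current_bs = effective_batch_size(64, 16)

# Two ways to reach 1280: more GPUs, or a larger per-GPU batch.
bigger_bs_more_gpus = effective_batch_size(64, 20)
bigger_bs_same_gpus = effective_batch_size(80, 16)

# Under the assumed linear rule, the lr would grow by this factor.
lr_ratio = bigger_bs_same_gpus / current_bs
```

Under that assumption, going from 1024 to 1280 is a 1.25x increase in learning rate, so any instability could come from the larger learning rate as much as from the batch size itself.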