Batch size ablation results

Open Yingdong-Hu opened this issue 4 years ago • 2 comments

Hello, thanks for your great work. Could you provide additional ablations with different batch sizes (e.g. smaller batch sizes such as 512 or 256, instead of the 1024 reported in the paper)? When I vary the training batch size, the final results vary a lot.

Yingdong-Hu avatar Mar 31 '21 03:03 Yingdong-Hu

Hi @Alxead. In our experience, a "sqrt" scheduling method should be used to adjust the learning rate. In our default setting (base learning rate 1), the learning rate for batch size 1024 is 1024 / 256 * 1 = 4. With sqrt scheduling, the learning rate for batch size 512 should be 4 * sqrt(512 / 1024) = 2.828. You can achieve this by passing '--base-lr 1.414' to the train script, since 512 / 256 * 1.414 = 2.828.
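
For other batch sizes, the same arithmetic can be sketched in a few lines of Python. This is only an illustration of the rule described above, not code from the repository: the helper names `sqrt_scaled_lr` and `base_lr_for` are hypothetical, and it assumes the train script computes the effective learning rate as `batch_size / 256 * base_lr`, as the default-setting calculation suggests.

```python
import math

# Reference setting taken from the comment above: batch size 1024 -> lr 4.
REF_BATCH_SIZE = 1024
REF_LR = 4.0


def sqrt_scaled_lr(batch_size: int) -> float:
    """Effective learning rate under sqrt scaling relative to the reference."""
    return REF_LR * math.sqrt(batch_size / REF_BATCH_SIZE)


def base_lr_for(batch_size: int) -> float:
    """Value to pass as --base-lr.

    Assumption: the script computes lr = batch_size / 256 * base_lr,
    so we invert that linear rule to hit the sqrt-scaled target.
    """
    return sqrt_scaled_lr(batch_size) * 256 / batch_size


if __name__ == "__main__":
    for bs in (1024, 512, 256):
        print(f"batch size {bs}: lr = {sqrt_scaled_lr(bs):.3f}, "
              f"--base-lr {base_lr_for(bs):.3f}")
```

Under these assumptions, batch size 256 would correspond to an effective learning rate of 2 and '--base-lr 2'.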

impiga avatar Apr 04 '21 12:04 impiga

Hi, thank you for your contribution. I was wondering whether you used learning rate decay, since the learning rate is quite high and should be reduced as the network converges. Thanks, Ram

ramchandracheke avatar Mar 02 '22 14:03 ramchandracheke