esvit

Question about the Learning Rate used for pretraining

Open Annbless opened this issue 3 years ago • 0 comments

Hello.

Thank you for the wonderful work! I have a question about the learning rate used to pretrain the Swin models in Table 1. According to the logs, the learning rate for the Swin-T model is 0.0005180447994195404 at epoch 201, while the learning rate for the Swin-S/B models is 0.00025939212681290886 at epoch 201. However, the parameters stored under the 'args' keyword in the pre-trained models are the same.

Could you please tell me why there is a difference in learning rate in the training log?
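
For reference, here is a quick check of the two logged values. It is only a minimal sketch: the `scaled_lr` helper, its argument names, and the batch sizes and GPU counts passed to it are hypothetical illustrations, not taken from the esvit code or logs. The two learning rates differ by a factor of roughly 2, which would be consistent with (but does not prove) a linear batch-size scaling rule applied with a different total batch size.

```python
# Learning rates copied from the training logs quoted above (epoch 201).
lr_swin_t = 0.0005180447994195404
lr_swin_sb = 0.00025939212681290886

# Ratio between the two logged values.
ratio = lr_swin_t / lr_swin_sb
print(f"Swin-T / Swin-S,B learning-rate ratio: {ratio:.4f}")


def scaled_lr(base_lr, batch_size_per_gpu, world_size, reference_batch=256):
    """Hypothetical linear scaling rule: lr grows with the total batch size.

    This is an assumption for illustration, not the confirmed esvit behaviour.
    """
    return base_lr * batch_size_per_gpu * world_size / reference_batch


# Illustrative numbers only: the same base_lr with half as many GPUs
# gives half the effective learning rate under this rule.
base_lr = 0.0005
lr_16_gpus = scaled_lr(base_lr, batch_size_per_gpu=32, world_size=16)
lr_8_gpus = scaled_lr(base_lr, batch_size_per_gpu=32, world_size=8)
print(f"hypothetical ratio from halving world_size: {lr_16_gpus / lr_8_gpus:.4f}")
```

If something like this applies, identical 'args' values could still produce different effective learning rates, but I could not confirm this from the checkpoints alone.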

Thanks in advance.

Annbless, Feb 25 '22 06:02