Stable-Pix2Seq
About the learning rate settings
In the original paper, the learning rate was set to 3e-3 and the weight decay to 5e-2. Why does the code use a learning rate of 1e-5 and a weight decay of 1e-4? Also, could you share the NLL_Loss value when the model converges, just for reference? Thanks!