Adam-experiments

How to choose wd?

Open MohitLamba94 opened this issue 3 years ago • 2 comments

Thank you for this wonderful benchmark.

Several experiments use wd=1.2e-6. Could you please give some guidelines or a rule of thumb for choosing the weight decay hyperparameter?

MohitLamba94, Jan 18 '21 08:01

@MohitLamba94

Any update?

twmht, May 05 '22 10:05

@MohitLamba94

Any update?

Sorry, I did not look into it any further.

MohitLamba94, May 05 '22 11:05