Alan Q. Wang
Digging deeper, setting lr=1e-3 and wd=1e-4 appears to be necessary to get reasonable results (i.e., non-degenerate solutions). These also seem to be the hyperparameters recommended by most papers using this...
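
For anyone unsure how those two values enter training, here is a minimal sketch. The comment does not say which optimizer or framework is used, so this assumes plain SGD with L2-style (coupled) weight decay, and `sgd_step` is a hypothetical helper written only for illustration. Only the values lr=1e-3 and wd=1e-4 come from the comment above.

```python
# Values quoted in the comment; everything else here is an assumption.
LR = 1e-3  # learning rate (lr)
WD = 1e-4  # weight decay (wd)


def sgd_step(params, grads, lr=LR, wd=WD):
    """One plain SGD step with L2-style weight decay.

    Each parameter p with gradient g is updated as
    p <- p - lr * (g + wd * p).
    """
    return [p - lr * (g + wd * p) for p, g in zip(params, grads)]


# Toy parameters and gradients, chosen only to exercise the update.
params = [1.0, -2.0]
grads = [0.5, 0.0]
new_params = sgd_step(params, grads)
print(new_params)
```

Note that the second parameter still moves slightly toward zero even though its gradient is zero. That shrinkage is the weight-decay term at work, and it may be part of why wd matters for avoiding degenerate solutions, though that is a guess rather than something the comment states.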