convnetjs
Fix Adam optimizer
Hi Andrej,
I recently ran the trainer demo on MNIST and wondered why the Adam optimizer performs so much worse than Adadelta.
I think I found a little bug in the Adam implementation.
According to the Adam paper (v8, https://arxiv.org/pdf/1412.6980v8.pdf), Algorithm 1 (p. 2), the bias-corrected moment estimates are obtained by dividing by (1 - beta^t), not multiplying by it. The fixed version behaves significantly better when running the trainer demo on MNIST. To get the results below I also changed the learning rate to 0.001 and the beta2 parameter to 0.999 (from 0.01 and 0.99 respectively), as recommended in the paper.
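For reference, here is a minimal sketch of a single Adam update with the corrected bias terms. This is not the convnetjs source verbatim; the function and variable names (`adamStep`, `m`, `v`, etc.) are mine for illustration, the point is only the two bias-correction lines:

```javascript
// Sketch of one Adam step (Algorithm 1 of the paper), assuming illustrative
// names rather than the exact convnetjs internals.
function adamStep(p, g, m, v, t, opts) {
  var lr    = opts.learning_rate; // e.g. 0.001, as recommended in the paper
  var beta1 = opts.beta1;         // e.g. 0.9
  var beta2 = opts.beta2;         // e.g. 0.999, as recommended in the paper
  var eps   = opts.eps;           // e.g. 1e-8

  for (var j = 0; j < p.length; j++) {
    // update biased first and second moment estimates
    m[j] = beta1 * m[j] + (1 - beta1) * g[j];
    v[j] = beta2 * v[j] + (1 - beta2) * g[j] * g[j];

    // bias correction: DIVIDE by (1 - beta^t).
    // Multiplying here (the bug) shrinks the step early in training instead
    // of compensating for the zero-initialized moment estimates.
    var mHat = m[j] / (1 - Math.pow(beta1, t));
    var vHat = v[j] / (1 - Math.pow(beta2, t));

    p[j] += -lr * mHat / (Math.sqrt(vHat) + eps);
  }
}

// toy usage: one step (t = 1) on a two-parameter "model"
var p = [0.5, -0.3], g = [0.1, -0.2];
var m = [0, 0], v = [0, 0];
adamStep(p, g, m, v, 1, { learning_rate: 0.001, beta1: 0.9, beta2: 0.999, eps: 1e-8 });
console.log(p);
```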
Before: [screenshot of the MNIST trainer demo with the original Adam implementation]
After: [screenshot of the MNIST trainer demo with the fixed Adam implementation]
Thank you so much! Can someone please verify this and merge it into a release (published to npm too, please)?