Default epsilon values for AdaBelief don't match the paper

Open danijar opened this issue 3 years ago • 0 comments

In the AdaBelief paper, there is only one epsilon = 1e-8, which is used both to damp the second moment estimate and as a constant in the denominator. In Optax, there are instead eps = 1e-16 and root_eps = 1e-16. Initially, I just set eps = 1e-8 in the hope of matching the paper, but just now noticed that I also need to set root_eps = 1e-8. A few ideas for how this might be improved:

  • Add a note in the documentation
  • Use the defaults eps = 1e-8 and root_eps = None, and then set root_eps = eps if root_eps is None
  • At least default to eps = 1e-8 and root_eps = 1e-8

Is there a particular reason the implementation uses different default hparams?
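
The second idea in the list above could look roughly like the sketch below. It is only an illustration, not Optax code: `resolve_eps` and `paper_adabelief` are hypothetical helper names, and the keyword is written `eps_root` because that is how the argument of `optax.adabelief` is spelled (this issue calls it root_eps).

```python
from typing import Optional, Tuple


def resolve_eps(eps: float = 1e-8,
                eps_root: Optional[float] = None) -> Tuple[float, float]:
    """Hypothetical helper: fall back to a single shared epsilon.

    If eps_root is left as None, it takes the value of eps, which mirrors
    the single-epsilon setup described for the paper in this issue.
    """
    if eps_root is None:
        eps_root = eps
    return eps, eps_root


def paper_adabelief(learning_rate, eps: float = 1e-8,
                    eps_root: Optional[float] = None, **kwargs):
    """Hypothetical wrapper around optax.adabelief with the proposed defaults."""
    # Imported lazily so that resolve_eps can be used without Optax installed.
    import optax

    eps, eps_root = resolve_eps(eps, eps_root)
    return optax.adabelief(learning_rate, eps=eps, eps_root=eps_root, **kwargs)
```

With this, `paper_adabelief(1e-3)` would use 1e-8 for both values, while passing `eps_root` explicitly still overrides the shared default.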

danijar · Oct 16 '22 17:10