
Add a mathematical description of the algorithms

Open vroulet opened this issue 1 year ago • 0 comments

As is already done for adam and nadam, it would be nice to have mathematical descriptions of as many algorithms as possible. To start with, a clear description of what sgd does with momentum and with nesterov would be very good. The algorithms to cover, where possible, are listed below (for some, the description may be too long to include). Cite the reference each time. On arXiv you can even download the paper's source and copy-paste the algorithm, but make sure it matches the implementation.

  • [x] SGD: https://github.com/google-deepmind/optax/pull/830
  • [x] AdaBelief: https://github.com/google-deepmind/optax/pull/869
  • [ ] Adagrad
  • [ ] Adafactor
  • [x] Adamax, Adamaxw: https://github.com/google-deepmind/optax/pull/918
  • [x] AdamW: https://github.com/google-deepmind/optax/pull/894
  • [ ] AMSGrad
  • [ ] Fromage
  • [ ] Lamb
  • [ ] Lars
  • [ ] Lion
  • [x] Noisy SGD: https://github.com/google-deepmind/optax/pull/857
  • [ ] Novograd
  • [ ] OptimisticGD
  • [ ] DifferentiallyPrivateSGD
  • [ ] Radam
  • [ ] RMSProp
  • [ ] SM3
  • [ ] Yogi
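
For concreteness, here is a minimal sketch of the kind of statement such a description would pin down for sgd with momentum and nesterov. It assumes the common "trace" formulation (accumulate `m = g + momentum * m`, then step along `m`, or along `g + momentum * m` for the Nesterov variant). The function name `sgd_momentum_step` and the scalar setting are hypothetical and only for illustration; the formulation should be checked against the actual implementation before it goes into a docstring.

```python
def sgd_momentum_step(param, grad, trace, learning_rate, momentum, nesterov=False):
    """One scalar step of SGD with momentum (illustrative sketch, not optax code).

    Assumed update rule:
        m_t = g_t + momentum * m_{t-1}
        d_t = g_t + momentum * m_t   if nesterov
        d_t = m_t                    otherwise
        x_t = x_{t-1} - learning_rate * d_t
    """
    new_trace = grad + momentum * trace
    if nesterov:
        # Nesterov variant: apply the momentum term once more on top of the gradient.
        direction = grad + momentum * new_trace
    else:
        direction = new_trace
    new_param = param - learning_rate * direction
    return new_param, new_trace


# Usage: minimize f(x) = 0.5 * x**2, whose gradient is x.
x, m = 1.0, 0.0
for _ in range(100):
    x, m = sgd_momentum_step(x, grad=x, trace=m, learning_rate=0.1, momentum=0.9)
```

Writing the rule out this way makes the difference between the two variants explicit: both keep the same accumulator, and only the direction used for the step changes.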

vroulet, Feb 02 '24 12:02