optax
Documentation for cosine decay schedule
Hello,
The formula in the documentation for the cosine_decay_schedule
(https://optax.readthedocs.io/en/latest/api/optimizer_schedules.html#optax.cosine_decay_schedule) would suggest that the learning rate increases again after T steps.
A quick look at the code confirms this is not the case, but it may be good to state it explicitly, as in linear_schedule.
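To make the difference concrete, here is a minimal sketch in plain Python, not the optax source. It assumes the behaviour described above: the step count is clamped to T (called `decay_steps` here) before the cosine is evaluated. The function names and the `alpha` floor parameter are illustrative choices that mirror the documented formula.

```python
import math


def cosine_decay_clamped(step, init_value, decay_steps, alpha=0.0):
    """Illustrative cosine decay that holds its final value after decay_steps."""
    # Assumption: the step is clamped, so the value stays flat once step >= decay_steps.
    t = min(step, decay_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * t / decay_steps))
    return init_value * ((1.0 - alpha) * cosine + alpha)


def cosine_decay_unclamped(step, init_value, decay_steps, alpha=0.0):
    """The same formula read literally, without clamping the step."""
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / decay_steps))
    return init_value * ((1.0 - alpha) * cosine + alpha)


if __name__ == "__main__":
    # Compare both readings before, at, and after the end of the decay.
    for step in (0, 50, 100, 150, 200):
        clamped = cosine_decay_clamped(step, 1.0, 100)
        literal = cosine_decay_unclamped(step, 1.0, 100)
        print(step, round(clamped, 4), round(literal, 4))
```

Past `decay_steps`, the clamped version stays at its final value, while the literal reading of the formula climbs back up. That gap is what the documentation could spell out.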
Happy to make a short PR! I could also propose a short formula/pseudocode for functions like piecewise_constant_schedule that do not have one.
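As an example of the kind of pseudocode I have in mind, here is a rough sketch in plain Python, not the library code. It assumes the schedule starts at an initial value and multiplies it by a scale factor at each boundary step that has been reached. The function name, argument names, and the choice that a scale applies from `step >= boundary` onwards are assumptions to be checked against the actual implementation.

```python
def piecewise_constant(step, init_value, boundaries_and_scales):
    """Illustrative piecewise-constant schedule: a product of scales at passed boundaries."""
    value = init_value
    for boundary, scale in sorted(boundaries_and_scales.items()):
        # Assumption: the scale takes effect at the boundary step itself.
        if step >= boundary:
            value *= scale
    return value


if __name__ == "__main__":
    # Hypothetical boundaries: halve at step 10, then divide by 10 at step 20.
    schedule = {10: 0.5, 20: 0.1}
    for step in (0, 9, 10, 19, 20, 100):
        print(step, piecewise_constant(step, 1.0, schedule))
```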
Best
GJ
Hello @gjhuizing,
Thanks for catching this! If you are willing to open such a PR, that would be great!
Great!
Solved in https://github.com/google-deepmind/optax/pull/1032