DemonRangerOptimizer
Quasi-Hyperbolic Rectified DEMON Adam/AMSGrad with AdaMod, Gradient Centralization, Lookahead, iterate averaging, and decoupled weight decay
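A minimal usage sketch, assuming the optimizer follows the standard torch.optim interface. The import path `optimizers`, the class name `DemonRanger`, and the constructor arguments shown are assumptions for illustration; check the repo for the actual module layout and the full list of DEMON/QH/AdaMod/Lookahead hyperparameters.

```python
import torch
from optimizers import DemonRanger  # assumed import path for this repo

model = torch.nn.Linear(10, 1)
# lr/weight_decay are standard torch.optim-style arguments; the
# optimizer-specific knobs are assumed to have usable defaults here.
optimizer = DemonRanger(model.parameters(), lr=1e-3, weight_decay=1e-2)

x, y = torch.randn(32, 10), torch.randn(32, 1)
loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```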
DemonRangerOptimizer issues (2)
https://github.com/pytorch/pytorch/commits/master/torch/optim/adam.py There have been a lot of changes there, such as the no-grad scope decorator (`@torch.no_grad()` on `step()`). Not sure whether they affect the math or performance. For your consideration.
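For reference, a minimal sketch of the pattern being pointed at: recent PyTorch versions wrap `Optimizer.step()` in `@torch.no_grad()` so the parameter update runs outside autograd. The `PlainSGD` class below is illustrative, not part of this repo.

```python
import torch

class PlainSGD(torch.optim.Optimizer):
    """Toy optimizer showing the no-grad scope decorator used upstream."""

    def __init__(self, params, lr=1e-3):
        super().__init__(params, defaults=dict(lr=lr))

    @torch.no_grad()  # update runs without building an autograd graph
    def step(self, closure=None):
        loss = None
        if closure is not None:
            with torch.enable_grad():  # the closure still needs gradients
                loss = closure()
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                # In-place update; no .data workaround needed under no_grad.
                p.add_(p.grad, alpha=-group["lr"])
        return loss
```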
I tried it on two tasks, but got NaNs during training. Any suggestions?