RAdam-Tensorflow

Simple Tensorflow implementation of "On The Variance Of The Adaptive Learning Rate And Beyond"

Results 3 RAdam-Tensorflow issues

When I use RAdam in an Estimator, I encounter a 'NaN loss during training' error. However, Adam works fine. ![image](https://user-images.githubusercontent.com/20656474/71705935-e4c06f00-2e1c-11ea-80b3-cb8ed0722d37.png)

Hi, Kim. I am also a developer working in the same field. I'm developing in a TF 2.0 environment, and I wonder whether this code will work in that environment as well....

In the algorithm outlined in the original paper, the threshold for whether adapted momentum is applied is set to ρ_t > 4; however, looking at the code, the...
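For readers following the threshold question, here is a minimal sketch of how ρ_t and the rectification term can be computed from the formulas in the paper. It is not this repository's code: the function names and the `threshold` parameter are hypothetical, chosen only to show how the cutoff changes which steps get the rectified update.

```python
import math


def rho_inf(beta2):
    # Maximum length of the approximated simple moving average.
    return 2.0 / (1.0 - beta2) - 1.0


def rho_t(step, beta2):
    # Length of the approximated SMA at a given 1-indexed step.
    beta2_t = beta2 ** step
    return rho_inf(beta2) - 2.0 * step * beta2_t / (1.0 - beta2_t)


def rectification(step, beta2=0.999, threshold=4.0):
    # Returns the rectification term r_t, or None when rho_t does not
    # exceed the threshold (the step would then fall back to the
    # un-adapted momentum update). `threshold` is an illustrative
    # parameter; the paper's algorithm uses rho_t > 4.
    r_inf = rho_inf(beta2)
    r_t = rho_t(step, beta2)
    if r_t <= threshold:
        return None
    numerator = (r_t - 4.0) * (r_t - 2.0) * r_inf
    denominator = (r_inf - 4.0) * (r_inf - 2.0) * r_t
    return math.sqrt(numerator / denominator)


if __name__ == "__main__":
    for step in (1, 4, 5, 6, 1000):
        print(step, rho_t(step, 0.999), rectification(step))
```

With beta2 = 0.999, ρ_t is just below 5 at step 5, so a cutoff of 4 and a cutoff of 5 differ in whether that step uses the rectified update.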