RAdam-Tensorflow

Published 20 hours ago

taki0112


Simple Tensorflow implementation of "On The Variance Of The Adaptive Learning Rate And Beyond"

Readme
Issues

Results 3 RAdam-Tensorflow issues

NaN loss during training

1

When I use RAdam with an Estimator, I get a 'NaN loss during training' error. However, Adam works fine. ![image](https://user-images.githubusercontent.com/20656474/71705935-e4c06f00-2e1c-11ea-80b3-cb8ed0722d37.png)

Yazhou-Liu

Work in TF 2.0

Hi, Kim. I am also a developer working in the same field. I'm developing in a TF 2.0 environment, and I wonder whether this code will work in that environment as well...

yongqyu

Difference in SMA threshold between code and paper

In the algorithm outlined in the original paper, the threshold for whether adapted momentum is applied is set to ρt > 4; however, looking at the code the...

joeforan76
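
To make the threshold question concrete, here is a minimal pure-Python sketch of the simple moving average (SMA) length ρt and the rectification term from the RAdam paper. It is not the repository's code: the function names and the `threshold` parameter are hypothetical, and the default of 4 simply mirrors the paper's condition quoted in the issue. The issue text is truncated, so the value the repository actually uses is not shown here.

```python
import math


def rho_inf(beta2):
    """Maximum SMA length: rho_inf = 2 / (1 - beta2) - 1."""
    return 2.0 / (1.0 - beta2) - 1.0


def rho_t(step, beta2):
    """SMA length at a 1-based step: rho_inf - 2 * t * beta2^t / (1 - beta2^t)."""
    beta2_t = beta2 ** step
    return rho_inf(beta2) - 2.0 * step * beta2_t / (1.0 - beta2_t)


def rectification(step, beta2=0.999, threshold=4.0):
    """Return the rectification term r_t, or None when rho_t <= threshold.

    `threshold` is an illustrative knob (an assumption of this sketch).
    It must be at least 4, otherwise the square root below can receive a
    negative argument. None means the update would fall back to plain
    momentum without the adaptive learning rate.
    """
    r_inf = rho_inf(beta2)
    r_t = rho_t(step, beta2)
    if r_t <= threshold:
        return None
    numerator = (r_t - 4.0) * (r_t - 2.0) * r_inf
    denominator = (r_inf - 4.0) * (r_inf - 2.0) * r_t
    return math.sqrt(numerator / denominator)


if __name__ == "__main__":
    # Compare the first step at which rectification switches on.
    for threshold in (4.0, 5.0):
        first = next(t for t in range(1, 100)
                     if rectification(t, threshold=threshold) is not None)
        print(f"threshold={threshold}: rectification starts at step {first}")
```

With `beta2=0.999`, ρt stays just below t in the first few steps, so raising the threshold from 4 to 5 in this sketch delays the adaptive update by one step. The rectification term is close to zero right after the switch and approaches 1 as training proceeds.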

About

Simple Tensorflow implementation of "On The Variance Of The Adaptive Learning Rate And Beyond"

97

Stars

14

Forks

Watchers

Owner

taki0112
