Adabelief-Optimizer
Repository for NeurIPS 2020 Spotlight "AdaBelief Optimizer: Adapting stepsizes by the belief in observed gradients"
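For context on the issues below, here is a minimal pure-Python sketch of the AdaBelief update rule as described in the paper (this is an illustrative re-derivation for a single scalar parameter, not the code shipped in this repository; `adabelief_step` and its defaults are my own naming):

```python
import math

def adabelief_step(theta, grad, m, s, t, lr=1e-3,
                   beta1=0.9, beta2=0.999, eps=1e-14):
    """One AdaBelief step for a single scalar parameter.

    AdaBelief replaces Adam's second moment EMA[g_t^2] with the
    variance of the gradient around its own EMA,
    s_t = EMA[(g_t - m_t)^2], the "belief" in the observed gradient.
    """
    m = beta1 * m + (1 - beta1) * grad                    # first moment, as in Adam
    s = beta2 * s + (1 - beta2) * (grad - m) ** 2 + eps   # belief term (note the +eps)
    m_hat = m / (1 - beta1 ** t)                          # bias correction
    s_hat = s / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(s_hat) + eps)
    return theta, m, s

# Usage: a few steps on f(x) = x^2, whose gradient is 2x
x, m, s = 1.0, 0.0, 0.0
for t in range(1, 101):
    x, m, s = adabelief_step(x, 2 * x, m, s, t, lr=0.01)
```

When the gradient is close to its EMA (a "confident" prediction), `s` is small and the effective step is large; when gradients disagree with the EMA, `s` grows and the step shrinks.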
Hi I was looking at the various versions you have in the `pypi_packages` folder and noticed that the order of computation of weight decay (which for some options modifies `grad`)...
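The distinction that issue points at — L2 regularization folded into `grad` before the moment updates, versus AdamW-style decoupled decay applied directly to the weights — can be sketched like this (an illustrative contrast on one scalar, not the package's actual code; `adaptive_step` is a hypothetical RMSProp-like update invented here to show the difference):

```python
import math

def adaptive_step(theta, grad, v, lr, wd, beta2=0.999, eps=1e-8, decoupled=True):
    """One RMSProp-like step with either coupled L2 or decoupled weight decay.

    Coupled: the decay term is added to grad BEFORE the second-moment
    update, so it is also rescaled by the adaptive denominator.
    Decoupled: the weights shrink directly and the gradient (and the
    moments built from it) stay untouched.
    """
    if not decoupled:
        grad = grad + wd * theta            # decay leaks into the moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad
    theta = theta - lr * grad / (math.sqrt(v) + eps)
    if decoupled:
        theta = theta - lr * wd * theta     # AdamW-style: decay the weights directly
    return theta, v
```

With plain SGD the two orderings coincide; with any adaptive denominator they give different parameter values, which is why the order in which weight decay modifies `grad` matters.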
Hiya, in the README you say that Rectify is implemented as an option, but the default is True. I would update the README to reflect that. You also make it...
Will AdaBelief support TensorFlow 1.10+?
Do you have a pure Keras implementation? Thanks
Remove in-place add of eps. Use in-place div for improved performance
Is there a way to suppress these messages by setting some parameters explicitly when they are enabled? ``` Weight decoupling enabled in AdaBelief Rectification enabled in AdaBelief ``` I skimmed...
I'm facing a problem with `AdaBeliefOptimizer`: `AttributeError: 'AdaBeliefOptimizer' object has no attribute '_set_hyper'` `optimizer = AdaBeliefOptimizer(learning_rate=1e-3, epsilon=1e-14, rectify=False)`
Recently I tried reproducing the results in the paper. I succeeded on CIFAR-10 and GAN, but the test accuracy on ImageNet is nearly 69.5%, which in the paper...
Hello, when I use AdaBelief with beta1 = 0, beta2 = 0.999 (SAGAN, BigGAN, WGAN-GP), the loss becomes NaN, while Adam works well. I am wondering whether the hyperparameter needs...
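A plausible mechanism for that NaN (my reading of the update rule, not a confirmed diagnosis): with beta1 = 0 the EMA m_t equals the current gradient g_t exactly, so the belief term (g_t − m_t)² is identically zero and s_t decays geometrically toward the tiny eps floor, which blows up the effective step size. A quick pure-Python check of the degenerate recursion:

```python
def s_recursion_beta1_zero(beta2=0.999, eps=1e-14, s0=1e-2, steps=50000):
    """With beta1 = 0, m_t = g_t, so (g_t - m_t)^2 == 0 and the
    AdaBelief second-moment recursion degenerates to
        s_t = beta2 * s_{t-1} + eps,
    which decays toward the fixed point eps / (1 - beta2)."""
    s = s0
    for _ in range(steps):
        s = beta2 * s + eps
    return s

s_final = s_recursion_beta1_zero()
# s_final is approximately eps / (1 - beta2) = 1e-11, so the update's
# denominator sqrt(s_hat) is on the order of 3e-6 and the effective
# learning rate explodes, which can drive the loss to NaN.
```

Adam does not collapse this way because its second moment tracks g_t² rather than (g_t − m_t)², and g_t² stays bounded away from zero in GAN training.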
As of [TF 2.11](https://www.gitclear.com/open_repos/tensorflow/tensorflow/release/v2.11.0), Optimizer must be imported from `tf.keras.optimizers.legacy` instead of `tf.keras.optimizers`. I have tested with TF 2.13.1 on AdaBelief_tf2_test.py from [the 0.3.0 branch](https://github.com/juntang-zhuang/Adabelief-Optimizer/blob/59691466bbe8cc5b3ae5c69b012e78befaaa35e1/pypi_packages/adabelief_tf0.3.0/adabelief_tf2/AdaBelief_tf2_test.py) (with a slight modification of...