
Is adabelief the best optimizer?

Open LifeIsStrange opened this issue 4 years ago • 7 comments

https://paperswithcode.com/paper/adabelief-optimizer-adapting-stepsizes-by-the

LifeIsStrange avatar Oct 23 '20 22:10 LifeIsStrange

"This work considers the update step in first-order methods. Other directions include Lookahead [42] which updates “fast” and “slow” weights separately, and is a wrapper that can combine with other optimizers; variance reduction methods [43, 44, 45] which reduce the variance in gradient; and LARS [46] which uses a layer-wise learning rate scaling. AdaBelief can be combined with these methods. Other variants of Adam have been proposed (e.g. NosAdam [47], Sadam [48] and Adax [49])."

LifeIsStrange avatar Oct 23 '20 22:10 LifeIsStrange

I tested AdaBelief on my task, and it performed worse than Ranger.

hiyyg avatar Dec 28 '20 08:12 hiyyg

@hiyyg Could you post your task, network, and the hyperparameters you used for the two optimizers?

juntang-zhuang avatar Aug 08 '21 01:08 juntang-zhuang

It was an internal task, so sorry, I cannot share it. The hyperparameters were all left at the defaults for both optimizers.

hiyyg avatar Aug 08 '21 02:08 hiyyg

@hiyyg Which version of AdaBelief did you use? I'm not sure whether the difference is caused by eps. From a quick skim of the Ranger code, its default is eps=1e-5, which is equivalent to eps=1e-10 for AdaBelief. The most recent AdaBelief release (0.2) defaults to eps=1e-16, which is equivalent to eps=1e-8 for Adam. The choice of eps is crucial for adaptive optimizers, so this could be the reason for the performance difference.
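A minimal sketch of the eps equivalence mentioned above, assuming the simplified picture in which Adam adds eps outside the square root of its second-moment estimate, while AdaBelief adds eps to its variance estimate before the square root. Under that assumption the two denominators share the same floor when the AdaBelief eps is the square of the Adam-style eps. The function names are hypothetical, and bias correction and the running average are ignored for clarity:

```python
import math


def adam_denominator(v_hat: float, eps: float) -> float:
    """Adam-style denominator: eps is added outside the square root."""
    return math.sqrt(v_hat) + eps


def adabelief_denominator(s_hat: float, eps: float) -> float:
    """Simplified AdaBelief-style denominator (assumption: eps is added to
    the variance estimate inside the square root, and once more outside)."""
    return math.sqrt(s_hat + eps) + eps


def equivalent_adabelief_eps(adam_style_eps: float) -> float:
    """AdaBelief eps that gives roughly the same denominator floor
    as an Adam-style eps, under the simplification above."""
    return adam_style_eps ** 2


# Compare the floors (second-moment estimate equal to zero) for the two
# Adam-style values discussed in the thread: 1e-5 and 1e-8.
for adam_eps in (1e-5, 1e-8):
    ab_eps = equivalent_adabelief_eps(adam_eps)
    print(adam_eps, ab_eps,
          adam_denominator(0.0, adam_eps),
          adabelief_denominator(0.0, ab_eps))
```

When comparing the two optimizers, it may therefore be fairer to set eps explicitly on both sides rather than relying on each package's default.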

juntang-zhuang avatar Aug 08 '21 02:08 juntang-zhuang

Thanks. I think I used the version from around 28 Dec 2020. Your information might be very useful for users who want to compare AdaBelief with Ranger.

hiyyg avatar Aug 08 '21 03:08 hiyyg

Thanks for the info. 28 Dec 2020 corresponds to roughly v0.1, where the default is eps=1e-16 for AdaBelief.

juntang-zhuang avatar Aug 08 '21 15:08 juntang-zhuang