
LAMB: Differences from the paper author's official implementation

Open binmakeswell opened this issue 5 years ago • 3 comments

The LAMB implementation in your released PyTorch version differs from the official TensorFlow version released by the paper's authors. In the official implementation, certain parameters are skipped by name when applying weight decay, e.g. exclude_from_weight_decay=["batch_normalization", "LayerNorm", "layer_norm"]. In your implementation, however, it seems that all parameters participate in the weight-decay calculation. Their implementation: https://github.com/tensorflow/addons/blob/master/tensorflow_addons/optimizers/lamb.py
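For reference, the exclusion logic described above can be sketched in PyTorch as follows. This is a hypothetical illustration, not the pytorch-optimizer or TF Addons code: `use_weight_decay` and `make_param_groups` are made-up helper names, and the pattern list is taken from the exclusion list quoted in this issue.

```python
# Hypothetical sketch of name-based weight-decay exclusion, mirroring
# the behavior described for the official TF Addons LAMB implementation.
import re

# Parameter-name patterns to exclude from weight decay (from the issue above).
EXCLUDE_FROM_WEIGHT_DECAY = ["batch_normalization", "LayerNorm", "layer_norm"]

def use_weight_decay(param_name, exclude_patterns=EXCLUDE_FROM_WEIGHT_DECAY):
    """Return False if any exclusion pattern matches the parameter name."""
    return not any(re.search(p, param_name) for p in exclude_patterns)

def make_param_groups(named_params, weight_decay=0.01):
    """Split (name, tensor) pairs into decay / no-decay optimizer groups.

    In PyTorch, the usual way to express per-parameter weight decay is to
    pass parameter groups like these to the optimizer constructor.
    """
    decay, no_decay = [], []
    for name, p in named_params:
        (decay if use_weight_decay(name) else no_decay).append(p)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]
```

With groups built this way, any optimizer that honors the per-group `weight_decay` field would skip decay on normalization parameters, matching the paper's behavior.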

binmakeswell avatar Nov 20 '20 12:11 binmakeswell

I suspect there is something wrong with this implementation. When I used LAMB in MXNet, it always worked well, but here...

EmilPi avatar Apr 24 '21 19:04 EmilPi

I also tried LAMB for https://github.com/coqui-ai/TTS/ on PyTorch 1.9, but it did not even reduce the training loss.

erogol avatar Jul 21 '21 08:07 erogol

Not sure when I will be able to take a look, but happy to accept PRs with fixes.

jettify avatar Oct 02 '21 15:10 jettify