Benoit Sklénard
Hi, I am also quite interested in the multi-GPU training capability. I did some tests with the ddp branch using PyTorch 2.1.1 on up to 16 GPUs (4 V100 per node)...
Hello @vroulet, indeed, removing the bias correction on the 2nd moment in both versions aligns the two implementations, and this actually corresponds to the [original AMSGrad paper](https://openreview.net/pdf?id=ryQu7f-RZ). I have...
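To illustrate the point about bias correction, here is a minimal scalar sketch. It assumes (this is not stated in the thread) that the two versions differ only in whether the running maximum is taken before or after the 2nd-moment bias correction. The function name `amsgrad_denominators` and its `bias_correction` flag are hypothetical and do not belong to any library.

```python
import math


def amsgrad_denominators(grads, b2=0.999, bias_correction=True):
    """Return sqrt(second-moment) terms for two hypothetical AMSGrad orderings.

    Variant A: running max of the raw moment v, bias correction applied after.
    Variant B: bias correction applied to v first, then the running max.
    Both orderings are assumptions made for illustration only.
    """
    v = 0.0
    max_raw = 0.0  # variant A state: max over the uncorrected v
    max_hat = 0.0  # variant B state: max over the corrected v
    out_a, out_b = [], []
    for t, g in enumerate(grads, start=1):
        v = b2 * v + (1 - b2) * g * g
        # With the correction disabled, the factor is exactly 1.
        c = 1 - b2 ** t if bias_correction else 1.0
        max_raw = max(max_raw, v)
        max_hat = max(max_hat, v / c)
        out_a.append(math.sqrt(max_raw / c))
        out_b.append(math.sqrt(max_hat))
    return out_a, out_b


grads = [1.0, 0.1, 0.1]
with_a, with_b = amsgrad_denominators(grads, bias_correction=True)
without_a, without_b = amsgrad_denominators(grads, bias_correction=False)
print(with_a == with_b)        # the two orderings disagree after a large first gradient
print(without_a == without_b)  # with no correction, the two orderings coincide
```

With the correction disabled, both variants reduce to the running maximum of the raw 2nd moment, which is the form used in the original AMSGrad paper.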
Yes, it is a good idea to add an option to remove bias correction. I am wondering whether it would also make sense to include an option to get...