AdamW-pytorch
Implementation and experiments for AdamW in PyTorch
AdamW-pytorch issues
If you look carefully at the formula in the article, the weight decay (w) is not multiplied by the learning rate (alpha), but rather by the schedule coefficient (eta). In...
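
To make the point about the formula concrete, here is a minimal sketch of a single scalar AdamW step. It assumes the update has the form theta <- theta - eta * (alpha * m_hat / (sqrt(v_hat) + eps) + w * theta). The function name `adamw_step` and its parameter names are hypothetical, chosen to match the symbols used in the issue; this is not the repository's code. In this form the weight decay `w` is scaled by the schedule coefficient `eta` only, never by the learning rate `alpha`.

```python
import math


def adamw_step(theta, grad, m, v, t, alpha=1e-3, eta=1.0, w=1e-2,
               beta1=0.9, beta2=0.999, eps=1e-8):
    """One decoupled-weight-decay Adam step on a scalar parameter (illustrative sketch).

    theta: current parameter value
    grad:  gradient of the loss only (no L2 term folded in)
    m, v:  running first and second moment estimates
    t:     1-based step counter, used for bias correction
    alpha: learning rate; eta: schedule coefficient; w: weight decay
    """
    # Standard Adam moment updates with bias correction.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)

    # alpha scales only the adaptive gradient term.
    adam_update = alpha * m_hat / (math.sqrt(v_hat) + eps)

    # eta scales both terms; the decay term w * theta is NOT multiplied by alpha.
    theta = theta - eta * (adam_update + w * theta)
    return theta, m, v
```

With a zero gradient the adaptive term vanishes, so the parameter shrinks by exactly `eta * w * theta` regardless of `alpha`. An implementation that multiplies the decay by the learning rate would instead shrink it by `alpha * w * theta`, which is a much weaker decay for typical values of `alpha`.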