pytorch-optimizer
AdamD implementation (or an option to skip bias-correction in Adam-derived optimizers)?
I recently put out a proposal to add an argument to Adam-derived optimizers that skips the bias-correction term on w, applying it only to v. See the figure attached in the issue https://github.com/pytorch/pytorch/issues/67105 and the write-up I put together for theoretical justification, AdamD: Improved bias-correction in Adam. Since it's still too early in the idea's existence to add this to the pytorch repo (according to them), your repo seems like a reasonable home for it. I am happy to send you a PR, but I would like to hear which you would prefer:
- A new optimizer, AdamD and AdamDW (mirroring Adam/AdamW, but with the bias-correction on the w term excluded from the step).
- An otherwise vanilla fork of Adam/AdamW with a boolean flag that lets the user turn the bias-correction on/off, plus adding this option to the relevant optimizers already included in this repo (a rough sketch of such a flag is included after this list). I have not read through the repo carefully, but this would likely include Lamb (where it would be an option to enable bias-correction on v only, since it is already excluded otherwise), AdamP, and maybe others.
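To make the flag option concrete, here is a minimal sketch of a single Adam-style update with such a switch. The function name and the `debias_w` keyword are hypothetical, not this repo's or PyTorch's API; the only change versus the standard Adam step is dropping the 1 / (1 - beta1^t) factor on the step size while keeping the correction on v.

```python
import torch


def adam_like_step(param, grad, exp_avg, exp_avg_sq, step, *,
                   lr=1e-3, betas=(0.9, 0.999), eps=1e-8, debias_w=True):
    """One Adam-style update. With debias_w=False, bias-correction is
    applied only to the second moment (v), not to the step on the
    weights (w), i.e. the AdamD variant."""
    beta1, beta2 = betas

    # Biased first and second moment estimates (same as Adam).
    exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)
    exp_avg_sq.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)

    bias_correction1 = 1 - beta1 ** step
    bias_correction2 = 1 - beta2 ** step

    # v is always bias-corrected.
    denom = (exp_avg_sq / bias_correction2).sqrt_().add_(eps)

    # The only difference: keep or drop the 1 / (1 - beta1^t) factor.
    step_size = lr / bias_correction1 if debias_w else lr

    param.addcdiv_(exp_avg, denom, value=-step_size)


# Tiny usage example on a single parameter tensor.
p = torch.zeros(3)
g = torch.ones(3)
m, v = torch.zeros_like(p), torch.zeros_like(p)
adam_like_step(p, g, m, v, step=1, debias_w=False)
```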
Let me know how you would like to proceed, or if you want any further clarification!
I will be happy to accept a PR. I like option 1; it seems like the clearer API. Internally, the implementation should share code where possible.
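For what it's worth, one way to get option 1 while still sharing code internally is a common base class that owns the single step implementation and takes the flag, with AdamD differing from Adam only in the flag's default. This is a structural sketch only; the base-class name and the `debias_w` keyword are placeholders, not this repo's actual classes, and step() is omitted.

```python
import torch


class _SharedAdamBase(torch.optim.Optimizer):
    """Placeholder base; the single shared step() implementation
    (switched by `debias_w`) would live here. Omitted in this sketch."""

    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8,
                 weight_decay=0.0, debias_w=True):
        defaults = dict(lr=lr, betas=betas, eps=eps,
                        weight_decay=weight_decay, debias_w=debias_w)
        super().__init__(params, defaults)


class Adam(_SharedAdamBase):
    """Standard bias-correction on both w and v."""


class AdamD(_SharedAdamBase):
    """Bias-correction applied to v only."""

    def __init__(self, params, **kwargs):
        super().__init__(params, debias_w=False, **kwargs)
```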