Add implementation of the Lion optimizer (EvoLved Sign Momentum)

Open raghulchandramouli opened this issue 1 month ago • 8 comments

Summary

I'd like to contribute an Optax implementation of the Lion optimizer, i.e. a gradient transformation and a convenience Lion(...) wrapper in contrib that composes decoupled weight decay and learning-rate scaling. Lion tracks a single momentum and uses the sign(...) of an interpolation between that momentum and the current gradient for updates, as described in the paper https://arxiv.org/abs/2302.06675
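To make the update rule concrete, here is a minimal sketch of a single Lion step as described in the paper. The function name `lion_step` and its arguments are hypothetical and chosen for illustration only; this is not the code from the proposed PR or from Optax itself.

```python
import jax
import jax.numpy as jnp


def lion_step(params, grads, mu, learning_rate=1e-4, b1=0.9, b2=0.99,
              weight_decay=0.0):
    """One Lion step on pytrees (hypothetical helper, for illustration)."""

    def new_param(p, m, g):
        # Direction: sign of an interpolation between momentum and gradient.
        direction = jnp.sign(b1 * m + (1.0 - b1) * g)
        # Decoupled weight decay is added to the direction before scaling.
        return p - learning_rate * (direction + weight_decay * p)

    new_params = jax.tree_util.tree_map(new_param, params, mu, grads)
    # The single momentum buffer is updated with a second coefficient b2.
    new_mu = jax.tree_util.tree_map(
        lambda m, g: b2 * m + (1.0 - b2) * g, mu, grads)
    return new_params, new_mu


params = {"w": jnp.array([1.0, -2.0, 3.0])}
grads = {"w": jnp.array([0.5, -0.1, 0.0])}
mu = jax.tree_util.tree_map(jnp.zeros_like, params)  # momentum starts at zero

new_params, new_mu = lion_step(params, grads, mu, learning_rate=0.1)
```

Because the direction is a sign, every nonzero coordinate moves by exactly `learning_rate` per step, regardless of the gradient magnitude.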

What I will include:

  1. Implementation file in (Optax/contrib/_lion.py)
  2. Test file in (Optax/contrib/_lion_test.py)
  3. A quick note on fp16 behaviour, with recommendations for dtype handling

Request

  1. Guidance on whether maintainers would be open to this style of contribution placed under Optax/contrib
  2. Any specific tests, coding style, or helper utils I should follow
  3. Confirmation that I can open a PR with the implementation and tests

Thanks - I'm happy to iterate quickly based on feedback

raghulchandramouli, Oct 24 '25 14:10

I have raised a PR for this. @vroulet, could you please take a look at it?

raghulchandramouli, Oct 24 '25 17:10

Linking the PR for visibility: https://github.com/google-deepmind/optax/pull/1438

rdyro, Oct 24 '25 18:10

Thanks a lot, @rdyro, for linking it.

raghulchandramouli, Oct 24 '25 20:10

Did you check this https://optax.readthedocs.io/en/latest/api/optimizers.html#optax.lion ?

vroulet, Oct 24 '25 22:10

Hey, thanks a lot for letting me know. I will re-implement _lion with smooth_sign instead. The existing implementation uses a hard sign function (jnp.sign()), which has limitations:

  • Poor gradient flow
  • Training instability caused by the discontinuity

@vroulet, let me know if I can do this implementation instead. Thanks a lot for pointing me to the existing Lion implementation.

raghulchandramouli, Oct 25 '25 05:10

Can you point to a resource that proposes using a smooth sign in Lion?

rdyro, Oct 26 '25 23:10

Sure, @rdyro: https://www.researchgate.net/publication/385679808_RLion_A_Refined_Lion_Optimizer_for_Deep_Learning is the paper that uses a smooth sign. It replaces the discrete sign(·) update of the Lion optimizer with a continuous, bounded function (arctan) to smooth out the fluctuations.
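As a rough illustration of the idea, the sketch below compares jnp.sign with an arctan-based replacement. The `smooth_sign` helper, its `scale` parameter, and the 2/π normalization are assumptions for illustration; the exact form used in the RLion paper may differ.

```python
import jax.numpy as jnp


def smooth_sign(x, scale=1.0):
    """Hypothetical smooth stand-in for jnp.sign, bounded in (-1, 1)."""
    # arctan maps the reals to (-pi/2, pi/2); 2/pi rescales that to (-1, 1).
    # A larger scale makes the curve approach the hard sign more closely.
    return (2.0 / jnp.pi) * jnp.arctan(scale * x)


x = jnp.array([-10.0, -0.1, 0.0, 0.1, 10.0])
hard = jnp.sign(x)                  # jumps between -1, 0 and 1
soft = smooth_sign(x, scale=10.0)   # continuous and strictly increasing
```

Unlike the hard sign, the smooth variant shrinks the update for coordinates whose interpolated momentum is close to zero, so the step size is no longer uniform across coordinates.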

raghulchandramouli, Oct 27 '25 13:10

Please take a look at this and let me know how you would like me to proceed.

raghulchandramouli, Oct 27 '25 13:10