lion-pytorch
🦁 Lion, a new optimizer discovered by Google Brain using genetic algorithms that is purportedly better than Adam(w), in PyTorch
One of the main benefits of LION is that it needs to save less data for each param. Adam needs to save both the momentum and RMSProp EMAs, while in LION we need...
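The memory point in the excerpt above can be sketched as follows. This is a minimal illustration, assuming the update rules as they are commonly described for Adam and Lion. Plain Python floats stand in for tensors, and the names `adam_step`, `lion_step`, and the state keys are hypothetical, not the lion-pytorch API.

```python
import math


def sign(x):
    # Returns -1, 0, or 1 for a scalar.
    return (x > 0) - (x < 0)


def adam_step(p, g, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # Adam keeps TWO running averages per parameter (plus a step counter).
    state["step"] += 1
    state["exp_avg"] = beta1 * state["exp_avg"] + (1 - beta1) * g
    state["exp_avg_sq"] = beta2 * state["exp_avg_sq"] + (1 - beta2) * g * g
    # Bias-corrected estimates.
    m_hat = state["exp_avg"] / (1 - beta1 ** state["step"])
    v_hat = state["exp_avg_sq"] / (1 - beta2 ** state["step"])
    return p - lr * m_hat / (math.sqrt(v_hat) + eps)


def lion_step(p, g, state, lr=1e-4, beta1=0.9, beta2=0.99, weight_decay=0.0):
    # Lion keeps ONE running average per parameter.
    # The update direction is only the sign of an interpolation
    # between the stored momentum and the current gradient.
    update = sign(beta1 * state["exp_avg"] + (1 - beta1) * g)
    # Decoupled weight decay, then the signed step.
    p = p * (1 - lr * weight_decay) - lr * update
    # The momentum is refreshed after the parameter update.
    state["exp_avg"] = beta2 * state["exp_avg"] + (1 - beta2) * g
    return p


# Hypothetical per-parameter optimizer state for a single scalar weight.
adam_state = {"step": 0, "exp_avg": 0.0, "exp_avg_sq": 0.0}
lion_state = {"exp_avg": 0.0}

p_adam = adam_step(1.0, 0.5, adam_state, lr=0.1)
p_lion = lion_step(1.0, 0.5, lion_state, lr=0.1)
```

In this sketch the Adam state holds two buffers the size of the parameter, while the Lion state holds one, which is where the per-parameter saving comes from.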
I've been experimenting with the LION optimizer in your other (great) Imagen repository. I can share my anecdotal experience and combinations: - Models of different sizes 0.2B, 0.7B and 1B...
Hi, thx for your great work! I set `use_triton=True`, and turned on automatic mixed precision training, but `inf` appeared in the results. Does the `lion_pytorch/triton.py` need to consider `bf16` or...
PyTorch has AMD ROCm builds. How can lion-pytorch use those?
Thank you so much for your great implementations! My collaborators and I have recently posted a manuscript online (available at [https://arxiv.org/abs/2307.10053](https://arxiv.org/abs/2307.10053)) that provides convergence guarantees for the Lion optimizer, especially in the...
https://github.com/TimDettmers/bitsandbytes/blob/main/compile_from_source.md
Hi @lucidrains, thanks for this implementation. I wonder if you're using distributed training for your [experiments](https://wandb.ai/lucidrains/lion-test/reports/Lion--VmlldzozNTY0OTQ0?accessToken=wxt5ha81c05k26zq01b51j3ondpzsfd1sfmng8x94g16vul5gnxq32zcjdzp5oel). If so, [as noted in Accelerate's docs](https://huggingface.co/docs/accelerate/concept_guides/performance#learning-rates), do you scale your learning rate (on...