bitsandbytes
Request to add 4-bit AdamW and 4-bit SGD
Paper and Code

Paper: *Memory Efficient Optimizers with 4-bit States*

Code:
- https://github.com/thu-ml/low-bit-optimizers/blob/main/lpmm/optim/optimizer.py
- https://github.com/thu-ml/low-bit-optimizers/blob/main/lpmm/optim/adamw.py
- https://github.com/thu-ml/low-bit-optimizers/blob/main/lpmm/optim/sgd.py
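The core idea in the linked code is to store optimizer moments in 4 bits with a per-block scale, dequantizing on the fly each step. Here is a minimal NumPy sketch of that blockwise scheme; it is illustrative only (a simple linear quantization map, whereas the actual repo uses non-linear quantization maps), and all function names here are made up for the example:

```python
import numpy as np

def quantize_4bit_blockwise(x, block_size=128):
    """Sketch: quantize a float tensor to signed 4-bit levels [-7, 7],
    keeping one float32 absmax scale per block."""
    flat = x.ravel().astype(np.float32)
    pad = (-len(flat)) % block_size          # pad so it divides into blocks
    flat = np.pad(flat, (0, pad))
    blocks = flat.reshape(-1, block_size)
    scales = np.abs(blocks).max(axis=1, keepdims=True)
    scales[scales == 0] = 1.0                # avoid divide-by-zero on all-zero blocks
    q = np.clip(np.round(blocks / scales * 7), -7, 7).astype(np.int8)
    return q, scales, x.shape, pad

def dequantize_4bit_blockwise(q, scales, shape, pad):
    """Sketch: reverse of the above -- rescale each block and restore the shape."""
    flat = (q.astype(np.float32) / 7 * scales).ravel()
    if pad:
        flat = flat[:-pad]
    return flat.reshape(shape)
```

The per-block absmax keeps the quantization error proportional to each block's local magnitude, which is why 4 bits is enough for the relatively smooth Adam moment tensors.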
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
raise
Raise,
edit: I am currently nowhere near good enough at programming to do this, but it would be pretty cool to combine some of the bnb paged 8-bit AdamW code with the 4-bit AdamW code to make a paged 4-bit AdamW. It would lower training requirements to even lower-VRAM cards than the current implementation.
It could be possible to fully fine-tune a 7B model with a 4-bit optimizer on a 24 GB card, with gradient accumulation, based off this chart.
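A rough back-of-the-envelope for the optimizer-state part of that claim (my own arithmetic, not from the chart; it ignores quantization scales, weights, gradients, and activations):

```python
# AdamW keeps two moment tensors (m and v) per parameter.
params = 7e9  # 7B model

gb = 1e9
state_fp32_gb = params * 4 * 2 / gb    # two fp32 states: ~56 GB
state_8bit_gb = params * 1 * 2 / gb    # two 8-bit states: ~14 GB
state_4bit_gb = params * 0.5 * 2 / gb  # two 4-bit states: ~7 GB

print(state_fp32_gb, state_8bit_gb, state_4bit_gb)
```

So going from 8-bit to 4-bit states would free roughly another 7 GB on a 7B model, which is what makes the 24 GB budget look plausible.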
If there happens to be a branch, or PR for this, I’d love to see it! Could you share a link?