bitsandbytes

Request to add 4-bit AdamW and 4-bit SGD

Open LiutongZhou opened this issue 1 year ago • 4 comments

Paper and Code

Paper: Memory Efficient Optimizers with 4-bit States

Code:

  • https://github.com/thu-ml/low-bit-optimizers/blob/main/lpmm/optim/optimizer.py
  • https://github.com/thu-ml/low-bit-optimizers/blob/main/lpmm/optim/adamw.py
  • https://github.com/thu-ml/low-bit-optimizers/blob/main/lpmm/optim/sgd.py

LiutongZhou • Sep 17 '23 14:09

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

github-actions[bot] • Dec 20 '23 15:12

raise

LiutongZhou • Jan 02 '24 21:01

Raise,

Edit: I am currently nowhere near good enough at programming to do this, but it would be pretty cool to combine some of the bnb paged 8-bit AdamW code with the 4-bit AdamW code to make a paged 4-bit AdamW. That would lower training memory requirements below what current implementations need, so cards with even less VRAM could be used.

Based on this chart, it could be possible to fully fine-tune a 7B model on a 24 GB card using a 4-bit optimizer with gradient accumulation. (Image: IMG_3013)
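For a rough sense of scale, here is a back-of-the-envelope sketch of the optimizer-state memory involved. It is not taken from the chart or from the linked code. The assumptions are mine: exactly 7e9 parameters, two AdamW states per parameter, decimal gigabytes, 16-bit weights and gradients, and no accounting for activations or quantization metadata (scales, block statistics). The function names are hypothetical.

```python
GB = 1e9  # decimal gigabytes (assumption, chosen for readability)


def optimizer_state_gb(num_params, bits_per_state, states_per_param=2):
    """Memory for optimizer states only; AdamW keeps two states per parameter."""
    return num_params * states_per_param * bits_per_state / 8 / GB


def weights_and_grads_gb(num_params, weight_bytes=2, grad_bytes=2):
    """Memory for weights plus gradients, assuming 16-bit storage by default."""
    return num_params * (weight_bytes + grad_bytes) / GB


num_params = 7e9  # assumed size of a "7B" model

for bits in (32, 8, 4):
    states = optimizer_state_gb(num_params, bits)
    print(f"{bits:>2}-bit AdamW states: {states:.1f} GB")

print(f"16-bit weights + grads: {weights_and_grads_gb(num_params):.1f} GB")
```

Under these assumptions the optimizer states shrink from 56 GB (32-bit) to 14 GB (8-bit) to 7 GB (4-bit), while 16-bit weights and gradients add another 28 GB. Whether a 24 GB card is enough therefore depends on what the chart assumes about weights, gradients, and activations, and on how much paging to CPU memory can absorb.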

NicolasMejiaPetit • Feb 19 '24 02:02

If there happens to be a branch or PR for this, I’d love to see it! Could you share a link?

NicolasMejiaPetit • Feb 22 '24 00:02