Initial kernel changes to support GaLore

Open matthewdouglas opened this issue 1 year ago • 7 comments

This is a draft containing some of the initial changes to support GaLore. So far this covers 2-state optimizers.

Optimizer2State.update_step() now takes an additional argument, return_updates. When a tensor is passed to hold the updates, the updates are written to that tensor and p is left unchanged. In that case, no weight decay is applied either.
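To make the intended semantics concrete, here is a minimal pure-Python sketch of an Adam-style step with an optional output buffer. It is not the kernel code from this PR: the function name `adam_step_sketch`, its signature, and the choice to store the delta that would have been added to `p` (learning rate included) are all assumptions made for illustration.

```python
import math

def adam_step_sketch(p, grad, m, v, step, lr=1e-3, beta1=0.9, beta2=0.999,
                     eps=1e-8, weight_decay=0.0, return_updates=None):
    """Hypothetical illustration only; not the bitsandbytes implementation."""
    bias1 = 1.0 - beta1 ** step
    bias2 = 1.0 - beta2 ** step
    for i, g in enumerate(grad):
        # The two optimizer states are always updated.
        m[i] = beta1 * m[i] + (1.0 - beta1) * g
        v[i] = beta2 * v[i] + (1.0 - beta2) * g * g
        delta = -lr * (m[i] / bias1) / (math.sqrt(v[i] / bias2) + eps)
        if return_updates is None:
            # Normal path: decoupled weight decay, then update p in place.
            p[i] = p[i] * (1.0 - lr * weight_decay) + delta
        else:
            # Assumed GaLore path: hand back the update, leave p untouched,
            # and skip weight decay.
            return_updates[i] = delta

p = [1.0, -2.0]
grad = [0.5, -0.25]
m, v = [0.0, 0.0], [0.0, 0.0]
updates = [0.0, 0.0]
adam_step_sketch(p, grad, m, v, step=1, weight_decay=0.1, return_updates=updates)
```

After the call, `p` still holds its original values and `updates` holds the per-element deltas, which the caller (for example, a GaLore projection step) can transform before applying them to the parameters itself.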

Needs tests, feedback welcome.

cc: @TimDettmers @jiaweizzhao @Titus-von-Koeller

matthewdouglas, Mar 18 '24 23:03

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

github-actions[bot], Mar 18 '24 23:03

@matthewdouglas Tim said he could review your work this weekend.

Titus-von-Koeller, Apr 05 '24 10:04

Updated with changes added for 1-state optimizers (Momentum, RMSProp, Adagrad, Lion).

matthewdouglas, Apr 07 '24 13:04