Add support for memory-efficient and faster optimizers

Open: rasbt opened this issue 1 year ago • 1 comment

Maybe the GaLore support (#1192) should be changed from GaloreArgs to a more general OptimizerArgs after all. Then we could also more easily consider other variants such as BAdam ("BAdam: A Memory Efficient Full Parameter Training Method for Large Language Models", https://arxiv.org/abs/2404.02827).
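
To make the idea concrete, here is a rough sketch of what a generic optimizer config could look like. This is only an illustration, not litgpt's actual code: the OptimizerArgs fields, the registry, and the build_optimizer helper are all hypothetical names made up for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable


@dataclass
class OptimizerArgs:
    """Hypothetical generic optimizer settings (not litgpt's real class)."""

    name: str = "adamw"
    learning_rate: float = 3e-4
    # Optimizer-specific options go here, so a new optimizer
    # does not need its own dedicated *Args class.
    extra: Dict[str, Any] = field(default_factory=dict)


# Hypothetical registry: optimizer name -> factory(params, lr=..., **extra).
OPTIMIZER_REGISTRY: Dict[str, Callable[..., Any]] = {}


def register_optimizer(name: str, factory: Callable[..., Any]) -> None:
    """Register a factory under a case-insensitive name."""
    OPTIMIZER_REGISTRY[name.lower()] = factory


def build_optimizer(args: OptimizerArgs, params: Iterable[Any]) -> Any:
    """Look up the factory by name and pass through the extra options."""
    try:
        factory = OPTIMIZER_REGISTRY[args.name.lower()]
    except KeyError:
        known = sorted(OPTIMIZER_REGISTRY)
        raise ValueError(f"Unknown optimizer {args.name!r}; known: {known}") from None
    return factory(params, lr=args.learning_rate, **args.extra)
```

With PyTorch installed, something like register_optimizer("adamw", torch.optim.AdamW) would plug in a standard optimizer, and GaLore or BAdam could be registered the same way with their extra hyperparameters passed via extra.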

The experiments from here look very compelling, and it adds only one hyperparameter:

[Screenshot, 2024-04-27, 8:36:56 AM]

rasbt, Apr 27 '24 13:04

Agreed

lantiga, Apr 29 '24 18:04