ColossalAI [FEATURE]: Integrate GaLore into Colossalai Optimizer(Gemini/Hybrid)

Open ericxsun opened this issue 11 months ago • 5 comments

Describe the feature

A recent paper, "GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection" (https://arxiv.org/pdf/2403.03507.pdf), presents a memory-efficient approach to training large language models (LLMs).

Can we integrate this memory-efficient technique into the ColossalAI framework?

FYI

GaLore AdamW: https://github.com/jiaweizzhao/GaLore/blob/master/galore_torch/adamw.py
8-bit GaLore AdamW: https://github.com/jiaweizzhao/GaLore/blob/master/galore_torch/adamw8bit.py

Mar 11 '24 03:03 ericxsun

Could anyone from the ColossalAI team take a look?

Mar 27 '24 03:03 ericxsun

Thanks! We will take a look.

Mar 27 '24 03:03 ver217

I will take multiple looks

Mar 27 '24 10:03 Edenzzzz

I see the MR, that's awesome! When can we use it?

Apr 15 '24 06:04 ericxsun

I plan to release it next week.

Apr 20 '24 05:04 Edenzzzz
