
[shardformer] support gradient accumulation for hybrid parallel plugin

Fridge003 opened this issue 2 years ago • 5 comments

Support gradient accumulation for the hybrid parallel plugin (by implementing a no_sync method for the plugin); see the sketch of the intended semantics below.

relevant issue: #4776
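
For context, here is a minimal sketch of the torch DDP semantics a plugin-level no_sync would mirror. The function and its arguments are illustrative placeholders; only DDP's no_sync() context manager is real API.

```python
import contextlib

def train_with_accumulation(model, dataloader, criterion, optimizer, accum_steps):
    """DDP-style gradient accumulation sketch.

    `model` is assumed to be wrapped in torch.nn.parallel.DistributedDataParallel,
    whose no_sync() context manager this feature request asks the plugin to mirror.
    """
    optimizer.zero_grad()
    for step, (inputs, labels) in enumerate(dataloader):
        sync = (step + 1) % accum_steps == 0
        # On non-boundary micro-steps, no_sync() defers the gradient all-reduce,
        # so gradients just accumulate locally in param.grad.
        ctx = contextlib.nullcontext() if sync else model.no_sync()
        with ctx:
            # Scale the loss so the accumulated gradient matches one
            # large-batch step.
            loss = criterion(model(inputs), labels) / accum_steps
            loss.backward()
        if sync:
            # Boundary step: gradients were all-reduced during this backward.
            optimizer.step()
            optimizer.zero_grad()
```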

Fridge003 avatar Oct 08 '23 06:10 Fridge003

Hi, any updates? I need this feature badly.

ShinoharaHare avatar Dec 13 '23 10:12 ShinoharaHare

Or is it possible to enable it on HybridParallelPlugin in a torch-like way (as described in the documentation)? However, unlike GeminiPlugin, HybridParallelPlugin seems to have no enable_gradient_accumulation option, which is confusing.

ShinoharaHare avatar Dec 13 '23 10:12 ShinoharaHare

> Or is it possible to enable it on HybridParallelPlugin in a torch-like way (as described in the documentation)? However, unlike GeminiPlugin, HybridParallelPlugin seems to have no enable_gradient_accumulation option, which is confusing.

Hi, we will implement this feature as soon as possible.

flybird11111 avatar Dec 14 '23 03:12 flybird11111

> Or is it possible to enable it on HybridParallelPlugin in a torch-like way (as described in the documentation)? However, unlike GeminiPlugin, HybridParallelPlugin seems to have no enable_gradient_accumulation option, which is confusing.

Hi, you can enable gradient accumulation with HybridParallelPlugin in the torch-like way described at https://colossalai.org/docs/features/gradient_accumulation_with_booster.
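
(A minimal sketch of that torch-like loop, assuming the standard Booster API; the toy model, dataloader, and GRADIENT_ACCUMULATION value are illustrative placeholders, not taken from the docs page:)

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import HybridParallelPlugin

GRADIENT_ACCUMULATION = 4  # illustrative value

colossalai.launch_from_torch(config={})  # newer versions may omit `config`

# Toy model and data standing in for a real training setup.
model = nn.Linear(32, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
dataset = TensorDataset(torch.randn(64, 32), torch.randint(0, 2, (64,)))
train_dataloader = DataLoader(dataset, batch_size=8)

plugin = HybridParallelPlugin(tp_size=1, pp_size=1)  # illustrative sizes
booster = Booster(plugin=plugin)
model, optimizer, criterion, train_dataloader, _ = booster.boost(
    model, optimizer, criterion, train_dataloader
)

optimizer.zero_grad()
for idx, (inputs, labels) in enumerate(train_dataloader):
    # Scale the loss so the accumulated gradient matches one large-batch step.
    loss = criterion(model(inputs), labels) / GRADIENT_ACCUMULATION
    booster.backward(loss, optimizer)
    # Only step and clear gradients every GRADIENT_ACCUMULATION micro-batches.
    if (idx + 1) % GRADIENT_ACCUMULATION == 0:
        optimizer.step()
        optimizer.zero_grad()
```

Note that without a no_sync implementation, gradients are still all-reduced on every micro-step, so this loop is correct but does not save the communication that no_sync would skip.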

flybird11111 avatar Dec 22 '23 08:12 flybird11111

@flybird11111 Hi, I couldn't find enable_gradient_accumulation or no_sync() in HybridParallelPlugin (https://github.com/hpcaitech/ColossalAI/blob/main/colossalai/booster/plugin/hybrid_parallel_plugin.py), so I wonder how to add gradient accumulation to HybridParallelPlugin following https://colossalai.org/docs/features/gradient_accumulation_with_booster. Can you provide more details?

cwszz avatar Dec 27 '23 03:12 cwszz