ColossalAI
[FEATURE]: add master_weights arg to HybridParallelPlugin
Describe the feature
When using CPU offload, setting master_weights=False in both GeminiPlugin and LowLevelZeroPlugin can reduce GPU memory usage and improve speed. Does HybridParallelPlugin also support this feature?
The ZeRO optimizer usually updates parameters in float32, using a float32 master copy of the weights. Updating in lower precision instead may lead to unstable training.
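As a rough illustration of the precision concern, here is a minimal sketch in plain Python. It does not use ColossalAI. The helper names (`to_fp16`, `to_fp32`, `train`) and the numbers (starting weight 1.0, update 1e-4, 1000 steps) are assumptions chosen for the demonstration. It simulates half-precision rounding with the standard `struct` module and compares accumulating small updates directly in float16 against accumulating them in a float32 master copy.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def train(steps: int, update: float, use_master_weights: bool) -> float:
    """Apply the same small update `steps` times and return the fp16 weight.

    Hypothetical toy loop: with master weights the update is accumulated in
    float32 and the float16 weight is refreshed from it; without them the
    update is added to the float16 weight directly.
    """
    weight_fp16 = to_fp16(1.0)
    master_fp32 = to_fp32(1.0)
    for _ in range(steps):
        if use_master_weights:
            master_fp32 = to_fp32(master_fp32 + update)
            weight_fp16 = to_fp16(master_fp32)
        else:
            weight_fp16 = to_fp16(weight_fp16 + update)
    return weight_fp16


with_master = train(steps=1000, update=1e-4, use_master_weights=True)
without_master = train(steps=1000, update=1e-4, use_master_weights=False)

print("with float32 master weights:   ", with_master)
print("without float32 master weights:", without_master)
```

Near 1.0, adjacent float16 values are about 0.001 apart, so each 1e-4 update is rounded away when it is applied to the float16 weight directly, and the weight never moves. With the float32 master copy the updates accumulate and the weight ends up close to 1.1. Real training involves scaled gradients and optimizer state, so this only shows the rounding effect in isolation.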