【Question】What is the minimum number of GPUs required to train DeepSeek 671B with GRPO? What about using LoRA?
https://company.hpc-ai.com/blog/shocking-release-deepseek-671b-fine-tuning-guide-revealed-unlock-the-upgraded-deepseek-suite-with-one-click-ai-players-ecstatic
The article above only gives the GPU requirements for SFT with LoRA. What about GRPO?
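For reference, a back-of-envelope lower bound (my own rough arithmetic, not from the article): GRPO keeps a frozen reference policy in memory alongside the trainable policy, and since DeepSeek 671B is an MoE model whose experts must all be resident, the full 671B parameters count toward storage even though only ~37B are active per token. A minimal sketch, ignoring activations, rollout KV cache, and framework overhead:

```python
# Back-of-envelope lower bound on aggregate GPU memory for GRPO on a
# 671B-parameter model. Rough arithmetic only, not official guidance.

N_PARAMS = 671e9   # total parameters (all MoE experts must be resident)
GB = 1024**3

policy_bf16    = N_PARAMS * 2   # trainable policy weights, 2 bytes/param
reference_bf16 = N_PARAMS * 2   # frozen GRPO reference model, 2 bytes/param

# Full fine-tuning with Adam: FP32 master weights plus two moment buffers,
# ~12 bytes/param. With LoRA this term shrinks to the (tiny) adapter size,
# which is why LoRA cuts the GPU count so sharply.
adam_states = N_PARAMS * 12

total_full_ft = policy_bf16 + reference_bf16 + adam_states
print(f"weights + optimizer (full FT): {total_full_ft / GB / 1024:.1f} TiB")

# Crude minimum GPU count, ignoring activations and rollout memory.
for mem_gb in (80, 96):   # e.g. H800 at 80 GB, H20 at 96 GB
    n_gpus = total_full_ft / (mem_gb * GB)
    print(f"  >= {n_gpus:.0f} GPUs at {mem_gb} GB each (lower bound)")
```

This comes out to roughly 10 TiB for full fine-tuning, i.e. on the order of 120+ GPUs at 80 GB each before activations are even counted, so an official GRPO figure from the team would still be very useful.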
Same question here. Also, if I reduce --max_length from 256 to 128 and --batch_size from 24 to 12, will that reduce fine-tuning memory consumption?
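It should reduce the activation part of memory, but not the weight/gradient/optimizer part. A minimal sketch of the scaling (general transformer reasoning, assuming --max_length and --batch_size map to sequence length and per-step batch size in the fine-tuning script, as in the commands above):

```python
# Minimal sketch: activation memory per transformer layer scales roughly
# with batch_size * seq_len * hidden_dim (attention adds a seq_len**2 term
# when flash-attention is not used). Parameter, gradient, and optimizer
# memory are unaffected by these two flags.

def activation_scale(batch_size: int, seq_len: int) -> int:
    """Relative activation cost, linear term only (simplification)."""
    return batch_size * seq_len

before = activation_scale(batch_size=24, seq_len=256)
after  = activation_scale(batch_size=12, seq_len=128)
print(f"relative activation memory: {after / before:.2f}x")  # 0.25x
```

So halving both flags cuts activations to roughly a quarter, which helps only if activations (not the 671B weights and optimizer states) are the bottleneck on your setup.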