DeepSpeedExamples
Memory usage of zero3 is larger than that of zero1
When I run the step1_supervised_finetuning script, I find that the memory usage of zero3 is larger than that of zero1, which seems unreasonable. Is there any other optimization here?
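For context on why this looks unreasonable, here is a rough sketch of the per-GPU model-state memory expected at each ZeRO stage. It assumes mixed-precision Adam with the usual accounting of 2 bytes per parameter for fp16 weights, 2 for fp16 gradients, and 12 for fp32 optimizer states. The function name and the example model size and GPU count are hypothetical and are not taken from the step1_supervised_finetuning script.

```python
def model_state_bytes_per_gpu(num_params: int, num_gpus: int, stage: int) -> float:
    """Approximate per-GPU bytes for model states under mixed-precision Adam.

    Assumption: 2 bytes/param for fp16 weights, 2 bytes/param for fp16
    gradients, 12 bytes/param for fp32 optimizer states (master weights,
    momentum, variance). Activations and buffers are NOT counted.
    """
    weights = 2.0 * num_params
    grads = 2.0 * num_params
    optim = 12.0 * num_params

    if stage >= 1:  # stage 1 partitions optimizer states across GPUs
        optim /= num_gpus
    if stage >= 2:  # stage 2 additionally partitions gradients
        grads /= num_gpus
    if stage >= 3:  # stage 3 additionally partitions the weights
        weights /= num_gpus
    return weights + grads + optim


if __name__ == "__main__":
    # Hypothetical setup: a 1.3B-parameter model on 8 GPUs.
    params, gpus = 1_300_000_000, 8
    for stage in (1, 3):
        gib = model_state_bytes_per_gpu(params, gpus, stage) / 2**30
        print(f"ZeRO stage {stage}: ~{gib:.2f} GiB of model states per GPU")
```

This estimate covers model states only. Measured usage also includes activations and communication buffers, which the sketch ignores, so it cannot by itself explain the numbers I am seeing.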
@blldd could you provide more details, such as the training scripts, the number of GPUs, etc.?
Closing the issue since there has been no follow-up. Please reopen it if necessary.