[minillm] does qwen2 support model parallelism?
When performing SFT on the Qwen2.5 32B model, I ran into an out-of-memory (OOM) error. Does the Qwen series of models support model parallelism during SFT?
Yes. You can use model parallelism just like with other models by setting

OPTS+=" --model-parallel"
OPTS+=" --model-parallel-size ${MP_SIZE}"

where ${MP_SIZE} is the model-parallel degree, i.e. the number of GPUs each model replica is partitioned across.
Thank you! That solved the problem. I would also like to ask: when using Qwen models for supervised fine-tuning (SFT) on my custom dataset, the validation loss first decreases and then increases, and both the teacher model and the student model produce repetitive generations. How did you resolve this issue?
You can increase the sampling temperature or apply a repetition penalty during generation.
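For reference, here is a minimal sketch of how those two knobs look in a plain Hugging Face `transformers` generation call; the checkpoint name and the exact values are illustrative assumptions, not settings taken from this repo:

```python
# Minimal sketch: temperature + repetition penalty with Hugging Face transformers.
# The checkpoint and parameter values below are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-32B-Instruct"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

inputs = tokenizer("Explain model parallelism in one paragraph.", return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,          # sampling must be enabled for temperature to take effect
    temperature=1.0,         # raise (e.g. toward 1.2) to flatten the next-token distribution
    repetition_penalty=1.2,  # values > 1.0 down-weight tokens that have already appeared
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

A higher temperature spreads probability mass over more candidate tokens, while a repetition penalty above 1.0 penalizes tokens that were already generated, which together usually reduce repetitive loops in the output.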