Jicheng Li
> Remove --fp16 and set --torch_dtype auto.

Thanks for the reply. After making this change, the same error still occurs.
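For reference, a minimal sketch of that change against an assumed ms-swift style launch command; the command name, flag spellings, and the omitted arguments are assumptions rather than the exact original run:

```shell
# Before (assumed failing setup): half precision forced with --fp16
#   swift sft --model <path-to-Qwen3-VL-235B-A22B-Instruct> --fp16 true ...
# After: drop --fp16 and let the dtype follow the checkpoint
swift sft \
  --model <path-to-Qwen3-VL-235B-A22B-Instruct> \
  --torch_dtype auto
  # ...dataset and parallelism arguments from the original command unchanged
```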
> > I'm using 8×8 A100 GPUs to fine-tune (SFT) the Qwen3-VL-235B-A22B-Instruct model, but I keep encountering out-of-memory (OOM) issues regardless of the settings. Could you please advise me on...
> try using tp4ep8pp8 or tp4ep8pp4

Thanks for your reply. Based on that configuration, the total GPU requirement would be tp × ep × pp = 4 × 8 × 8 = 256 or 4 × 8 × 4 = 128 GPUs,...
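For reference, a sketch of the GPU-count arithmetic behind that reply, assuming Megatron-style parallel-size flags (the flag names below are assumptions, and the launcher plus remaining training arguments are omitted):

```shell
# Suggested tp4 ep8 pp8 layout, expressed with assumed Megatron-style flag names:
#   --tensor_model_parallel_size 4 --expert_model_parallel_size 8 --pipeline_model_parallel_size 8

# Minimum world size implied by each suggested layout (tp x ep x pp):
TP=4; EP=8
for PP in 8 4; do
  echo "tp${TP} ep${EP} pp${PP}: needs $((TP * EP * PP)) GPUs"   # prints 256, then 128
done
echo "available: $((8 * 8)) GPUs (8 nodes x 8 A100)"             # prints 64
```

Neither layout fits within the 64 GPUs available on the 8×8 A100 cluster, which is the point of the reply above.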