fp674018495
I think you can change `loss_data = loss_var.data[0]` to `loss_data = float(loss_var)`.
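A minimal sketch of that change, assuming `loss_var` is a 0-dim (scalar) loss tensor as returned by most PyTorch loss functions. The tensors `pred` and `target` are made up for illustration. On recent PyTorch versions, indexing a 0-dim tensor with `[0]` raises an error, while `float(...)` and `.item()` both return a plain Python number.

```python
import torch

# Hypothetical inputs, only here to produce a scalar loss.
pred = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
target = torch.tensor([1.5, 2.0, 2.5])

# mse_loss with the default reduction returns a 0-dim tensor.
loss_var = torch.nn.functional.mse_loss(pred, target)

# Old style that fails on 0-dim tensors: loss_var.data[0]
# Suggested replacement:
loss_data = float(loss_var)

# Equivalent alternative: .item() also extracts the Python scalar.
loss_item = loss_var.item()

print(loss_data, loss_item)
```

Either form detaches the value from the autograd graph, so it is safe to use for logging.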
I have the same issue with https://huggingface.co/Qwen/Qwen3-235B-A22B-GPTQ-Int4 using TP=8. I printed `size_k` and `BLOCK_SIZE_K`: `size_k = 192` and `BLOCK_SIZE_K = 128`. What needs to change?
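The arithmetic behind those two numbers can be sketched as follows. This assumes the model's MoE intermediate size is 1536 (the value I believe its `config.json` reports; please verify) and that the kernel needs `size_k` to be a multiple of `BLOCK_SIZE_K`, which is inferred from the printed values rather than confirmed. The helper names are hypothetical.

```python
# Assumption: moe_intermediate_size of the model; check config.json.
MOE_INTERMEDIATE_SIZE = 1536
# Value printed in the report above.
BLOCK_SIZE_K = 128


def shard_size_k(intermediate_size: int, tp: int) -> int:
    """Per-rank K dimension when the expert weights are split across tp ranks."""
    if intermediate_size % tp != 0:
        raise ValueError(f"{intermediate_size} is not divisible by TP={tp}")
    return intermediate_size // tp


def is_block_aligned(size_k: int, block_size_k: int) -> bool:
    """True when size_k is a whole number of K blocks."""
    return size_k % block_size_k == 0


for tp in (1, 2, 4, 8):
    size_k = shard_size_k(MOE_INTERMEDIATE_SIZE, tp)
    aligned = is_block_aligned(size_k, BLOCK_SIZE_K)
    print(f"TP={tp}: size_k={size_k}, multiple of {BLOCK_SIZE_K}: {aligned}")
```

Under these assumptions, TP=8 is the first degree where the per-rank shard (192) stops being a multiple of 128, which matches the reported values.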
> For errors caused by a tensor-parallel degree that is too high, you can set the `--enable-expert-parallel` flag. Reference: [#17327](https://github.com/vllm-project/vllm/issues/17327) Deploying a model with such settings might reduce the inference efficienc