Cola Chan

Results: 17 comments by Cola Chan

I cannot remember exactly, but I did solve it. It had nothing to do with vLLM or SGLang. I remember that our server was under maintenance at that time. Once they finished the...

> I finally found that it is a cluster communication bug. You can test it by running multi-machine inference with vLLM; if that works, then the cluster communication is OK....
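Before running the full vLLM multi-machine inference test described above, a quick first sanity check is whether the nodes can even reach each other on the relevant port. A minimal sketch using only the standard library (the host and port are placeholders for a real worker node, and `check_reachable` is a hypothetical helper, not part of vLLM):

```python
import socket

def check_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Demo against a local listener standing in for a remote worker node.
    server = socket.socket()
    server.bind(("127.0.0.1", 0))  # OS picks a free port
    server.listen(1)
    port = server.getsockname()[1]
    print(check_reachable("127.0.0.1", port))   # listener is up
    server.close()
    print(check_reachable("127.0.0.1", port))   # listener is gone
```

If this basic check fails between nodes, the problem is in the cluster network or firewall, not in vLLM or SGLang, which matches the diagnosis in the comment above.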

My own fix for this problem: first check for fp16 precision issues; then I upgraded CUDA to 12.1 and transformers from 3.34 to 3.37, and after that it worked.

> They are incompatible. I ran into it too; you have to uninstall it before it works.

I hit the same problem, and it was resolved after uninstalling deepspeed. Even the vllm/triton compatibility issue was resolved the same way, which is honestly absurd.