Xinference vLLM: NCCL error: unhandled system error (run with NCCL_DEBUG=INFO for details)

System Info

CentOS x86_64, xinference:1.21, CUDA 12.4, GPU: P40*4

Running Xinference with Docker?

[x] docker
[ ] pip install
[ ] installation from source

Version info

xinference:1.21

The command used to start Xinference

xinference launch --model-engine vllm --model-name deepseek-r1-distill-qwen --model-path /data/models/DeepSeek-R1-Distill-Qwen-32B --n-gpu 4 --size-in-billions 32 --model-format pytorch --max_model_len 4096 --dtype half

Reproduction

Expected behavior

I'm not sure whether this is a system-level error or whether it is caused by an incompatibility with vLLM.

Feb 10 '25 13:02 amzfc

Did you use docker?

Feb 11 '25 02:02 qinxuye

Yes, I use Docker.

Feb 11 '25 02:02 amzfc

Add --shm-size=128g to the docker run command.

Feb 11 '25 06:02 qinxuye

It worked for me, thanks.

Mar 13 '25 02:03 IdleIdiot
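
For anyone landing here with the same error, the fix from this thread can be sketched as a container start command. NCCL uses shared memory for communication between GPU worker processes on the same host, and Docker gives a container only a small /dev/shm by default, so raising it with --shm-size is a plausible reason the fix works. This is a minimal sketch, not the exact command used by anyone in the thread: the image tag, the port mapping, and the /data/models mount are assumptions to be replaced with your own values, and the script only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch: start the Xinference container with a larger /dev/shm and NCCL debug logging.
# Assumptions (not from the thread): the image tag, port mapping, and volume mount
# below are placeholders; adjust them to your own setup.
IMAGE="xprobe/xinference:v<your_version>"   # placeholder tag, replace before running
SHM_SIZE="${SHM_SIZE:-128g}"                # value suggested in the thread

# Build the argument list in the positional parameters so quoting stays intact.
set -- docker run \
  --gpus all \
  --shm-size="$SHM_SIZE" \
  -e NCCL_DEBUG=INFO \
  -v /data/models:/data/models \
  -p 9997:9997 \
  "$IMAGE" \
  xinference-local -H 0.0.0.0

# Print the command for review; execute it only when RUN=1 is set.
CMD="$*"
echo "$CMD"
if [ "${RUN:-0}" = "1" ]; then
  "$@"
fi
```

Once the container is up, running df -h /dev/shm inside it should show the new size, and with NCCL_DEBUG=INFO set the NCCL log should give more detail if the error persists.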