xinference 0.15.2: launching a local model fails with "Model not found"

Open wujingbo-web opened this issue 5 months ago • 2 comments

System Info

Host: CentOS 7.9, CUDA 12.4, Python 3.10

Running Xinference with Docker?

[X] docker
[ ] pip install
[ ] installation from source

Version info

0.15.2

The command used to start Xinference

xinference launch --model_path /data1/model/Qwen2.5-72B-Instruct-AWQ --model-engine vllm -n Qwen2.5-72B-Instruct-AWQ -s 72 -f awq -q int4

Reproduction

1. Manually download the model weights to the local machine.
2. Start the Docker container and run this command inside it:

    xinference launch --model_path /data1/model/Qwen2.5-72B-Instruct-AWQ --model-engine vllm -n Qwen2.5-72B-Instruct-AWQ -s 72 -f awq -q int4

3. The command fails with the following output:

    Launch model name: Qwen2.5-72B-Instruct-AWQ with kwargs: {'model_path': '/data1/model/Qwen2.5-72B-Instruct-AWQ'}
    Traceback (most recent call last):
      File "/usr/local/bin/xinference", line 8, in <module>
        sys.exit(cli())
      File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
        return self.main(*args, **kwargs)
      File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
        rv = self.invoke(ctx)
      File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1688, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
        return __callback(*args, **kwargs)
      File "/usr/local/lib/python3.10/dist-packages/click/decorators.py", line 33, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/usr/local/lib/python3.10/dist-packages/xinference/deploy/cmdline.py", line 901, in model_launch
        model_uid = client.launch_model(
      File "/usr/local/lib/python3.10/dist-packages/xinference/client/restful/restful_client.py", line 940, in launch_model
        raise RuntimeError(
    RuntimeError: Failed to launch model, detail: [address=0.0.0.0:15216, pid=108] Model not found, name: Qwen2.5-72B-Instruct-AWQ, format: awq, size: 72, quantization: int4

Expected behavior

The model should launch normally. Launching the model through the UI currently works, but launching from weights I downloaded myself fails.
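
A possible diagnostic step, not part of the original report: since the error says the name `Qwen2.5-72B-Instruct-AWQ` was not found, it may help to check which model names the server actually has registered. The sketch below is a minimal example under stated assumptions: that the server exposes `GET /v1/model_registrations/{model_type}` and returns a JSON list of objects with a `model_name` key. The helper names `fetch_registrations` and `find_candidates` are hypothetical and introduced here only for illustration.

```python
import json
from urllib.request import urlopen


def fetch_registrations(endpoint: str, model_type: str = "LLM") -> list:
    """Fetch the model registrations known to the server.

    Assumption: the server answers GET /v1/model_registrations/{model_type}
    with a JSON list of objects that each carry a "model_name" key.
    """
    url = f"{endpoint.rstrip('/')}/v1/model_registrations/{model_type}"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def find_candidates(registrations: list, needle: str) -> list:
    """Return registered model names containing `needle`, ignoring case."""
    needle = needle.lower()
    # Entries without a "model_name" key are skipped rather than raising.
    names = [r.get("model_name", "") for r in registrations]
    return sorted(n for n in names if n and needle in n.lower())
```

For instance, `find_candidates(fetch_registrations("http://127.0.0.1:9997"), "qwen2.5")` would list the registered names containing "qwen2.5"; the address is an assumption and should be replaced with the actual endpoint of the running server. An empty result for the name passed to `-n` would be consistent with the "Model not found" error above.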

Sep 24 '24 07:09 wujingbo-web
