How to deploy multiple replicas of a speech model
System Info / 系統信息
Linux (CentOS), 2 × NVIDIA A40 GPUs
Running Xinference with Docker? / 是否使用 Docker 运行 Xinference?
- [x] docker / docker
- [ ] pip install / 通过 pip install 安装
- [ ] installation from source / 从源码安装
Version info / 版本信息
1.3.1.post
The command used to start Xinference / 用以启动 xinference 的命令
nerdctl run -d --name xinference --network xinference-net -p 127.0.0.1:9997:9997 -v /data/voice-model:/root/.cache/modelscope -e LD_LIBRARY_PATH=/opt/conda/envs/ffmpeg-env/lib --gpus all xprobe/xinference:latest xinference-local -H 0.0.0.0
Reproduction / 复现过程
I have two GPUs and want to deploy a model with more than 2 replicas. How should I launch the model? As it stands, I can only start 2 replicas, which leaves a lot of idle resources unused. If I set a replica count greater than 2, an error is raised. A rough sketch of how I launch the model is shown below.
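For reference, this is a minimal sketch of the launch call via the Python client; the model name is just an example speech model (not necessarily the one in use), and it assumes `launch_model` accepts the `replica` and `n_gpu` arguments in this version:

```python
from xinference.client import Client

# Connect to the local Xinference endpoint started by the container above
client = Client("http://127.0.0.1:9997")

# Asking for more replicas than GPUs (here 4 replicas on 2 cards) fails,
# since each replica appears to be scheduled onto its own GPU by default.
model_uid = client.launch_model(
    model_name="CosyVoice-300M-SFT",  # example speech model, not the exact one in use
    model_type="audio",
    replica=4,   # > 2 -> error on a 2-GPU machine
    n_gpu=1,     # one GPU per replica
)
```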
Expected behavior / 期待表现
I would like to be able to deploy multiple replicas on a single GPU so that the remaining resources are also utilized.
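A possible workaround sketch (untested; it assumes `launch_model` accepts a `gpu_idx` argument in this version, and the model name and UIDs are hypothetical) is to launch the same model several times under distinct model UIDs and pin each instance to a GPU index, so two instances can share one card:

```python
from xinference.client import Client

client = Client("http://127.0.0.1:9997")

# Launch four independent instances, two pinned to each of the two A40s.
# gpu_idx is assumed to pin an instance to a specific card; distinct
# model_uid values keep the instances addressable separately.
for i in range(4):
    client.launch_model(
        model_uid=f"speech-model-{i}",    # hypothetical UID
        model_name="CosyVoice-300M-SFT",  # example speech model
        model_type="audio",
        n_gpu=1,
        gpu_idx=[i % 2],                  # 0, 1, 0, 1 -> two instances per card
    )
```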
This issue is stale because it has been open for 7 days with no activity.
This issue was closed because it has been inactive for 5 days since being marked as stale.