xinference concurrent request handling issue
Describe the bug
I connected FastGPT to a vLLM-backed qwen32b model deployed with xinference. When testing with 4 concurrent requests, the xinference backend throws an error, the running model disappears from the UI, and the GPU stays at 100% utilization.
To Reproduce
To help us to reproduce this bug, please provide information below:
- Your Python version: 3.10
- The version of xinference you use: 0.11.0
- Versions of crucial packages.
- Full stack of the error.
- Minimized code to reproduce the error.
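Since no minimized reproduction was attached, here is a hedged sketch of one: it fires 4 concurrent chat-completion requests at an OpenAI-compatible endpoint, which is how xinference typically exposes models. The base URL, port, and model name below are assumptions and should be replaced with the actual deployment values.

```python
# Minimal concurrency repro sketch (URL, port, and model name are
# assumptions -- adjust to match the actual xinference deployment).
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

BASE_URL = "http://127.0.0.1:9997/v1/chat/completions"  # assumed endpoint
MODEL = "qwen32b"  # assumed model name as registered in xinference

def build_payload(prompt: str) -> bytes:
    """Build an OpenAI-style chat completion request body."""
    return json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")

def fire(prompt: str) -> int:
    """Send one request and return the HTTP status code."""
    req = urllib.request.Request(
        BASE_URL,
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        return resp.status

if __name__ == "__main__":
    # 4 concurrent requests -- the level at which the crash was observed.
    with ThreadPoolExecutor(max_workers=4) as pool:
        statuses = list(pool.map(fire, [f"request {i}" for i in range(4)]))
    print(statuses)
```

Running this against the deployment should reproduce the backend error at 4 concurrent requests if the report's conditions hold.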
Expected behavior
The model should serve 4 concurrent requests without the backend crashing, and should remain listed and available in the UI.
Additional context
- vllm version: 0.4.1