vllm
[Bug]: GPU memory usage keeps growing after the server has been running for a while
Your current environment
Hardware: 2 × A100
Launch command: python -m vllm.entrypoints.openai.api_server --host 0.0.0.0 --port 7864 --max-model-len 8000 --served-model-name chat-v2.0 --model /workspace/sdata/checkpoint-140-merged --enforce-eager --tensor-parallel-size 2 --gpu-memory-utilization 0.95
Model Input Dumps
🐛 Describe the bug
After startup, GPU memory usage keeps growing as the server handles requests, and eventually the server crashes.
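To put numbers on the growth, something like the following could be left running next to the server. It is only a sketch and not part of vLLM: it assumes `nvidia-smi` is on the PATH, and the helper names (`parse_gpu_memory`, `sample_gpu_memory`, `log_gpu_memory`) and the 60-second interval are made up for illustration.

```python
import subprocess
import time

# nvidia-smi query that prints one CSV line per GPU, e.g. "0, 77000, 81920"
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse nvidia-smi CSV output into (gpu_index, used_mib, total_mib) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, used, total = (field.strip() for field in line.split(","))
        rows.append((int(index), int(used), int(total)))
    return rows


def sample_gpu_memory():
    """Run nvidia-smi once and return the parsed per-GPU memory usage."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)


def log_gpu_memory(interval_s=60):
    """Print one timestamped line per GPU every interval_s seconds until interrupted."""
    while True:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        for index, used, total in sample_gpu_memory():
            print(f"{stamp} gpu={index} used={used}MiB total={total}MiB", flush=True)
        time.sleep(interval_s)


# Hypothetical usage: call log_gpu_memory() from a separate process and
# redirect its output to a file to attach to the report.
```

A log like this, covering the period from startup to the crash, would show whether the growth is steady or tied to particular requests.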
Before submitting a new issue...
- [X] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.