Leoyed

Results: 1 issue by Leoyed

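### Your current environment

vllm Version: 0.8.1

### How would you like to use vllm

I am using vllm to launch the Qwen2.5-32B-Instruct-INT8-W8A16 model, but it keeps reporting that GPU memory is insufficient. I am running on two RTX 3090 cards with 24 GB of VRAM each. Is there a problem with how I am launching it, or is the available GPU memory simply not enough to load the model?

Launch command:

`vllm serve Qwen2.5-32B-Instruct-INT8-W8A16 --tensor-parallel-size 2 --gpu-memory-utilization 0.9 --max-model-len 4096 --enforce-eager`

Error output:

ERROR...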
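A rough back-of-the-envelope check may help frame the question. The sketch below only estimates the per-GPU footprint of the weights against the budget implied by `--gpu-memory-utilization`. It assumes roughly 32.5 billion parameters stored at 1 byte each (W8) and an even split across the two tensor-parallel ranks; the parameter count and the helper names are illustrative assumptions, not values taken from the issue. It ignores layers that may remain unquantized, activations, the CUDA context, and the KV cache, so it cannot by itself explain the truncated error above.

```python
# Rough per-GPU memory estimate. All model figures here are assumptions
# for illustration, not measurements from the reported setup.

GIB = 2**30


def per_gpu_weight_gib(num_params: float, bytes_per_param: float, tp_size: int) -> float:
    """Approximate weight memory per GPU, assuming an even tensor-parallel split."""
    return num_params * bytes_per_param / tp_size / GIB


def per_gpu_budget_gib(gpu_mem_gib: float, utilization: float) -> float:
    """Memory the process may use per GPU for a given utilization fraction."""
    return gpu_mem_gib * utilization


# Assumed: ~32.5e9 parameters, 1 byte per weight (INT8), 2 GPUs of 24 GiB each.
weights_gib = per_gpu_weight_gib(32.5e9, 1, 2)
budget_gib = per_gpu_budget_gib(24, 0.9)
headroom_gib = budget_gib - weights_gib

print(f"weights per GPU : {weights_gib:.2f} GiB")
print(f"budget per GPU  : {budget_gib:.2f} GiB")
print(f"headroom per GPU: {headroom_gib:.2f} GiB (activations, KV cache, overhead)")
```

Under these assumptions the weights alone fit within the per-GPU budget, but the remaining headroom is only a few GiB, and everything the estimate leaves out has to fit in it. Other processes already using the GPUs would reduce it further.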

usage