xuyuedream


Close GNOME first (presumably so the desktop session stops holding GPU memory), then start the server:

vllm serve Qwen/Qwen2.5-32B-Instruct-GPTQ-Int4 --dtype=float16 --gpu-memory-utilization 0.99 --max-model-len=4096 --enforce_eager
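
The two steps could be scripted roughly as below. This is only a sketch under assumptions not stated in the comment: a systemd-based distribution where switching to `multi-user.target` stops the graphical session, and an NVIDIA GPU with `nvidia-smi` available. The `run` helper and the `DRY_RUN` variable are hypothetical, added for illustration; by default the script only prints what it would execute.

```shell
#!/usr/bin/env bash
# Sketch: stop the graphical session, check GPU memory, then launch vLLM.
# Run it from a text console or over SSH: stopping the graphical session
# also kills any terminal running inside GNOME.

# Same flags as the command above, kept in an array so they are easy to edit.
VLLM_ARGS=(
  serve Qwen/Qwen2.5-32B-Instruct-GPTQ-Int4
  --dtype=float16
  --gpu-memory-utilization 0.99
  --max-model-len=4096
  --enforce_eager
)

# Hypothetical switch: DRY_RUN=1 (the default) prints commands, DRY_RUN=0 runs them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Assumption: systemd manages the display manager that starts GNOME.
run sudo systemctl isolate multi-user.target

# Check how much GPU memory is still in use before starting the server.
run nvidia-smi --query-gpu=memory.used,memory.total --format=csv

run vllm "${VLLM_ARGS[@]}"
```

Setting `DRY_RUN=0` makes the script execute the commands instead of printing them.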