wang tianyi
**Describe the bug**

The vLLM server is deployed with the command below:

    CUDA_VISIBLE_DEVICES=4,6 swift deploy \
        --model "/hdd/wangty/model/Qwen2.5-VL-32B-Instruct-AWQ" \
        --infer_backend vllm \
        --served_model_name Qwen2.5-VL-32B-Instruct-AWQ \
        --vllm_max_model_len 8192 \
        --vllm_gpu_memory_utilization 0.9 \
        --vllm_tensor_parallel_size 2 \
        --vllm_enforce_eager true \
        ...