
Disable cuda version check in vllm-openai image

Open · zhaoyang-star opened this pull request 9 months ago • 2 comments

Fix #4521

Currently there is no need to check the CUDA version when using the FP8 KV cache. As of now, vLLM's binaries are compiled with CUDA 12.1 and public PyTorch release versions by default, and the vllm-openai image also ships with CUDA 12.1.
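For context, the check being disabled gates the FP8 KV cache on a locally detected CUDA toolkit version. Below is a minimal sketch of what such a guard might look like; the names `get_nvcc_cuda_version` and `_verify_cache_dtype` are illustrative assumptions and may not match vLLM's actual code:

```python
# Illustrative sketch of the kind of CUDA version guard this PR disables.
# All names here are assumptions for illustration, not vLLM's exact code.
import subprocess

from packaging.version import Version


def get_nvcc_cuda_version() -> Version:
    """Hypothetical helper: parse the release version from `nvcc --version`."""
    output = subprocess.check_output(["nvcc", "--version"], text=True)
    # Example line: "Cuda compilation tools, release 12.1, V12.1.105"
    release = output.split("release ")[1].split(",")[0]
    return Version(release)


def _verify_cache_dtype(cache_dtype: str) -> None:
    """Reject FP8 E5M2 KV cache on toolkits older than CUDA 11.8."""
    if cache_dtype == "fp8_e5m2":
        # This is the check the PR removes: it requires a local nvcc even
        # though the prebuilt wheels and the vllm-openai image already
        # ship with CUDA 12.1, which satisfies the requirement anyway.
        if get_nvcc_cuda_version() < Version("11.8"):
            raise ValueError(
                "FP8 E5M2 KV cache requires CUDA 11.8 or newer.")
```

Since the vllm-openai image pins CUDA 12.1, the guard adds only a hard dependency on a local `nvcc` binary without ruling out any real configuration, which is the rationale for dropping it. Users would still opt into the feature the same way, e.g. via the `--kv-cache-dtype fp8_e5m2` engine argument in the vLLM versions current at the time of this PR.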

zhaoyang-star avatar May 01 '24 15:05 zhaoyang-star

Sorry, I just merged the other PR. Can you resolve the conflict?

simon-mo avatar May 01 '24 16:05 simon-mo

🤦‍♂️ Sorry, another conflict.

simon-mo avatar May 02 '24 18:05 simon-mo

@simon-mo The conflict is resolved. Please take another look.

zhaoyang-star avatar May 05 '24 08:05 zhaoyang-star