kang
Could you give me a rough estimate of when that might be?
Thanks for sharing :)
I ran into the same issue, and it seems to be caused by the installation order. In my case, I removed `flash-attn` from requirements.txt and ran `pip install -r...
I'm serving with 8 A100 (80 GB) GPUs and max_token_len=16384. When I put 5 images in the chat template, the vLLM server dies. Sometimes it doesn't die right away, but after a...