Lu Fang
The root cause is actually CMake 3.26.0; upgrading to 3.26.1 or a newer version should solve the problem.
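If useful, here is a minimal sketch (assuming `cmake` is on PATH; the pip command is just one way to upgrade) to verify the installed version before building:

```python
# Hypothetical sanity check: confirm the installed CMake is newer than the
# buggy 3.26.0 release before building. Assumes `cmake` is on PATH.
import re
import subprocess

out = subprocess.run(["cmake", "--version"], capture_output=True, text=True).stdout
match = re.search(r"cmake version (\d+)\.(\d+)\.(\d+)", out)
assert match and tuple(map(int, match.groups())) >= (3, 26, 1), (
    "Hit the CMake 3.26.0 bug; upgrade with e.g.: pip install -U 'cmake>=3.26.1'"
)
```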
Same as https://github.com/vllm-project/vllm/issues/18748
Try --dtype float32?
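For example, via the offline Python API; this is just a sketch, and the model name is a placeholder for yours:

```python
# Minimal sketch: load the model in full precision to rule out fp16/bf16
# numerical issues. Replace the placeholder model with the one you're testing.
from vllm import LLM

llm = LLM(model="facebook/opt-125m", dtype="float32")
outputs = llm.generate(["Hello, my name is"])
print(outputs[0].outputs[0].text)
```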
Sorry about this issue. We are currently looking into LoRA support for Llama4. cc: @frank-wei
Yes, please review.
Formatting nit. Also wondering: how do we test this here?
You can try removing the torch.compile cache and see if it makes any difference, or try VLLM_DISABLE_COMPILE_CACHE=1 to disable the torch.compile cache. Likely it's not due to torch.compile, but another...
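Roughly, a sketch of both options; the cache path below is the usual default under `~/.cache/vllm` and is an assumption on my side:

```python
# Sketch of both suggestions; run this before launching vLLM in this process.
# The cache path is assumed to be the default; adjust it if your
# VLLM_CACHE_ROOT points elsewhere.
import os
import shutil

# Option 1: remove the on-disk torch.compile cache.
cache_dir = os.path.expanduser("~/.cache/vllm/torch_compile_cache")
if os.path.isdir(cache_dir):
    shutil.rmtree(cache_dir)

# Option 2: disable the compile cache entirely for this run.
os.environ["VLLM_DISABLE_COMPILE_CACHE"] = "1"
```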
Could you check whether the failed test is related? For example, does it also fail locally without this PR?
What's the new wheel size? :-)
We will try to pick this up.