Jee Jee Li
> After reading the conversation [here](https://github.com/vllm-project/vllm/issues/6126#issuecomment-2208852102), it sounds like we would also need to set this env variable accordingly when using Triton punica kernel (e.g., once we merge [this](https://github.com/vllm-project/vllm/pull/5036) PR)....
Have you tried v0?

```bash
VLLM_USE_V1=0 vllm serve ....
```
Similar issue: https://github.com/vllm-project/vllm/issues/17392
I'll try to reproduce and address this issue.
Can you try #17370? It should fix this issue.
Could you try https://github.com/vllm-project/vllm/pull/17435? Please rebuild from source.
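For reference, a minimal sketch of checking out that PR and rebuilding from source, assuming your `origin` remote points at vllm-project/vllm (the local branch name `pr-17435` is just illustrative):

```bash
# Fetch the PR branch and switch to it
git fetch origin pull/17435/head:pr-17435
git checkout pr-17435

# Rebuild vLLM from source in the current environment
pip install -e .
```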
This problem should not occur if torch 2.7.0 is used.
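A quick way to check which torch version is installed, and a generic upgrade command (pick the wheel matching your CUDA setup):

```bash
# Print the installed torch version
python -c "import torch; print(torch.__version__)"

# Upgrade if it is older than 2.7.0
pip install --upgrade "torch>=2.7.0"
```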
See: https://docs.vllm.ai/en/latest/features/lora.html#dynamically-serving-lora-adapters
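As a rough sketch of what that doc page describes (the adapter name and path below are placeholders, and the server is assumed to have been started with `--enable-lora`):

```bash
# Allow runtime LoRA updates before starting the server
export VLLM_ALLOW_RUNTIME_LORA_UPDATING=True

# Dynamically load a LoRA adapter into the running server
curl -X POST http://localhost:8000/v1/load_lora_adapter \
  -H "Content-Type: application/json" \
  -d '{"lora_name": "sql_adapter", "lora_path": "/path/to/sql-lora-adapter"}'

# Unload it again when it is no longer needed
curl -X POST http://localhost:8000/v1/unload_lora_adapter \
  -H "Content-Type: application/json" \
  -d '{"lora_name": "sql_adapter"}'
```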
This PR has been open for quite a while, and it seems no one is interested in it.
@tdoublep Thank you for your contribution. Considering that LoRA has many variants, we can probably only maintain and support the most commonly used features. I'm not sure whether we should consider this...