
Please add LoRA support for higher ranks and alpha values

parikshitsaikia1619 opened this issue on Feb 13, 2024 · 14 comments

ValueError: LoRA rank 64 is greater than max_lora_rank 16.
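For context, a minimal sketch of where this error comes from, assuming vLLM's offline LLM API (the model name below is a placeholder): max_lora_rank defaults to 16, and any adapter trained with a higher rank is rejected when it is loaded.

    from vllm import LLM

    # max_lora_rank defaults to 16, so loading an adapter trained with a
    # higher rank (e.g. r=64) fails with:
    #   ValueError: LoRA rank 64 is greater than max_lora_rank 16.
    llm = LLM(model="base-model-name", enable_lora=True)  # placeholder model name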

parikshitsaikia1619 · Feb 13, 2024

Mark

SuperBruceJia · Mar 6, 2024

Bump

Peter-Devine · Mar 8, 2024

It's not well documented, but you just need to pass --max-lora-rank 64 (or whatever rank you need) when serving, since the default is 16.

python -m vllm.entrypoints.openai.api_server \
    --max-lora-rank 64 \
    --model model_name \
    --enable-lora \
    --lora-modules lora-name=lora_path
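
Once the server is up, the adapter is addressed by the name given to --lora-modules. A minimal client sketch, assuming the default port 8000 and the openai Python package (the model/adapter names are the placeholders from the command above):

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

    completion = client.completions.create(
        model="lora-name",  # the adapter name registered via --lora-modules
        prompt="Hello, my name is",
        max_tokens=32,
    )
    print(completion.choices[0].text)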

dspoka · Mar 16, 2024

Thanks for the answer, it helped me as well. For those using the Python API, the equivalent is:

    from vllm import LLM
    import torch

    llm = LLM(
        model=args.model, tensor_parallel_size=torch.cuda.device_count(),
        dtype=args.dtype, trust_remote_code=True, enable_lora=True, max_lora_rank=64,
    )
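
To actually route a generation through the adapter with this offline API, the adapter is passed per request via LoRARequest. A sketch building on the llm object above (the adapter name, integer ID, and local path are placeholders):

    from vllm import SamplingParams
    from vllm.lora.request import LoRARequest

    outputs = llm.generate(
        ["Write a haiku about GPUs."],
        SamplingParams(max_tokens=64),
        # placeholder adapter name, integer ID, and local path
        lora_request=LoRARequest("my-adapter", 1, "/path/to/lora_adapter"),
    )
    print(outputs[0].outputs[0].text)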

spreadingmind · Mar 20, 2024

Both answers work for me, up to rank 64. Rank > 64 is not supported yet.

See #3934

Napuh · Apr 10, 2024

Can we get LoRA rank > 64 supported and merged?

Edit: I'm also curious whether capping the rank at 64 was a deliberate design choice; if so, please let me know.

patrickrho · Jun 9, 2024

Bump. I need support for much larger adapters. Thanks.

kevinjesse · Jun 11, 2024

Is there something special about LoRA ranks above 64? I wonder why only ranks up to 64 are supported.

jiangjin1999 · Jun 29, 2024

Same here; this is a blocker for me.

JohnUiterwyk · Aug 2, 2024

@JohnUiterwyk, has this not been fixed by the suggestions from @dspoka and @spreadingmind? Their suggestions worked for me.

Peter-Devine · Aug 4, 2024

No: the maximum max_lora_rank is 64, and going higher than that throws an error. I have adapters with ranks 128 and 256 for certain use cases, and I cannot serve them with vLLM because of the hardcoded limit on the values allowed for max_lora_rank.

JohnUiterwyk · Sep 2, 2024

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!

github-actions[bot] · Dec 2, 2024

Any updates on this? Recent papers have shown that rank 256 can be very beneficial, for example. I suspect this trend will continue and that ranks will keep increasing in the near future.

AntreasAntonio · Dec 9, 2024

Interesting, do you know the paper link, @AntreasAntonio?

Jiaxin-Wen · Mar 3, 2025

Wanted to update: LoRA rank 256 works with vLLM 0.8.2.

Vaibhav-Sahai · Apr 1, 2025

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!

github-actions[bot] · Jun 30, 2025

This issue has been automatically closed due to inactivity. Please feel free to reopen if you feel it is still relevant. Thank you!

github-actions[bot] · Jul 30, 2025