
Please add LoRA support for higher ranks and alpha values

parikshitsaikia1619 opened this issue on Feb 13, 2024 · 14 comments

ValueError: LoRA rank 64 is greater than max_lora_rank 16.
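For context, a minimal sketch of where this error comes from, assuming vLLM's offline LLM API (the model name below is a placeholder): max_lora_rank defaults to 16, and any adapter trained with a higher rank is rejected when it is loaded.

    from vllm import LLM

    # max_lora_rank defaults to 16, so loading an adapter trained with a
    # higher rank (e.g. r=64) fails with:
    #   ValueError: LoRA rank 64 is greater than max_lora_rank 16.
    llm = LLM(model="base-model-name", enable_lora=True)  # placeholder model name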

parikshitsaikia1619 · Feb 13, 2024

Mark

SuperBruceJia · Mar 6, 2024

Bump

Peter-Devine · Mar 8, 2024

It's not well documented, but you just need to pass --max-lora-rank 64 (or whatever rank you need) when serving, since the default is 16.

python -m vllm.entrypoints.openai.api_server \
    --max-lora-rank 64 \
    --model model_name \
    --enable-lora \
    --lora-modules lora-name=lora_path
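
Once the server is up, the adapter is addressed by the name given to --lora-modules. A minimal client sketch, assuming the default port 8000 and the openai Python package (the model/adapter names are the placeholders from the command above):

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

    completion = client.completions.create(
        model="lora-name",  # the adapter name registered via --lora-modules
        prompt="Hello, my name is",
        max_tokens=32,
    )
    print(completion.choices[0].text)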

dspoka · Mar 16, 2024

Thanks for the answer, it helped me as well. For those using the Python API, the equivalent is:

    from vllm import LLM
    import torch

    llm = LLM(
        model=args.model, tensor_parallel_size=torch.cuda.device_count(),
        dtype=args.dtype, trust_remote_code=True, enable_lora=True, max_lora_rank=64,
    )
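
To actually route a generation through the adapter with this offline API, the adapter is passed per request via LoRARequest. A sketch building on the llm object above (the adapter name, integer ID, and local path are placeholders):

    from vllm import SamplingParams
    from vllm.lora.request import LoRARequest

    outputs = llm.generate(
        ["Write a haiku about GPUs."],
        SamplingParams(max_tokens=64),
        # placeholder adapter name, integer ID, and local path
        lora_request=LoRARequest("my-adapter", 1, "/path/to/lora_adapter"),
    )
    print(outputs[0].outputs[0].text)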

spreadingmind · Mar 20, 2024

Both answers work for me, up to rank 64. Rank > 64 is not supported yet.

See #3934

Napuh · Apr 10, 2024

Can we get LoRA rank > 64 supported and merged?

Edit: I'm also curious whether capping the rank at 64 was a deliberate design choice; if so, please let me know.

patrickrho · Jun 9, 2024

Bump. I need support for much larger adapters. Thanks.

kevinjesse · Jun 11, 2024

Is there something special about LoRA ranks above 64? I wonder why only ranks up to 64 are supported.

jiangjin1999 · Jun 29, 2024

Same here; this is a blocker for me.

JohnUiterwyk · Aug 2, 2024

@JohnUiterwyk, has this not been fixed by the suggestions from @dspoka and @spreadingmind? Their suggestions worked for me.

Peter-Devine · Aug 4, 2024

No: the maximum max_lora_rank is 64, and going higher than that throws an error. I have adapters with ranks 128 and 256 for certain use cases, and I cannot serve them with vLLM because of the hardcoded limit on the values allowed for max_lora_rank.

JohnUiterwyk · Sep 2, 2024

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!

github-actions[bot] · Dec 2, 2024

Any updates on this? Recent papers have shown that rank 256 can be very beneficial, for example. I suspect this trend will continue and that ranks will keep increasing in the near future.

AntreasAntonio · Dec 9, 2024

Interesting, do you know the paper link, @AntreasAntonio?

Jiaxin-Wen · Mar 3, 2025

Wanted to update: LoRA rank 256 works with vLLM 0.8.2.

Vaibhav-Sahai · Apr 1, 2025

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!

github-actions[bot] · Jun 30, 2025

This issue has been automatically closed due to inactivity. Please feel free to reopen if you feel it is still relevant. Thank you!

github-actions[bot] · Jul 30, 2025