Daniel Spokoyny

Results: 1 comment by Daniel Spokoyny

It's not well documented, but you need to pass "--max-lora-rank 64" (or whatever rank your adapter was trained with) when serving, since the default is 16:

python -m vllm.entrypoints.openai.api_server --max-lora-rank 64 \
    --model model_name \...
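If you're not sure what rank to pass, here is a rough sketch that reads it from the adapter itself. It assumes a PEFT-style adapter directory whose adapter_config.json stores the rank under the "r" key; `read_lora_rank` and `build_serve_args` are hypothetical helper names, and only the two flags from the command above are included, so add whatever else you normally pass.

```python
import json
import tempfile
from pathlib import Path


def read_lora_rank(adapter_dir):
    """Return the LoRA rank ("r") recorded in a PEFT-style adapter_config.json.

    Assumption: the adapter was saved with PEFT, which writes the rank as "r".
    """
    config_path = Path(adapter_dir) / "adapter_config.json"
    with open(config_path) as f:
        return int(json.load(f)["r"])


def build_serve_args(model_name, adapter_dir):
    """Build the api_server argument list with --max-lora-rank taken from the adapter.

    Hypothetical helper: it only covers the flags mentioned in the comment above.
    """
    rank = read_lora_rank(adapter_dir)
    return [
        "python", "-m", "vllm.entrypoints.openai.api_server",
        "--max-lora-rank", str(rank),
        "--model", model_name,
    ]


if __name__ == "__main__":
    # Demo with a throwaway adapter directory; a real adapter already has this file.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "adapter_config.json").write_text(json.dumps({"r": 64}))
        print(" ".join(build_serve_args("model_name", tmp)))
```

One caveat: vLLM may only accept certain values for this flag, so if your adapter's rank is unusual you will probably need to round up to the next value it allows rather than pass the rank verbatim.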