[Frontend] Support adding a new LoRA module to a live server in OpenAI Entrypoints
The previous version of the OpenAI entrypoints didn't support adding a LoRA adapter to a live server. With this version you can register a new adapter path with a command like the following:
curl -X POST your_host:your_port/add_lora \
  -H "Content-Type: application/json" \
  -d '{
    "lora_name": "your_new_lora_model_name",
    "lora_local_path": "your_new_lora_model_path"
  }'
After adding the adapter, you can use this model just like any other existing LoRA.
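For example, the new adapter can be selected through the model field of the standard OpenAI-compatible completions endpoint; the prompt and sampling parameters below are only placeholders:

curl -X POST your_host:your_port/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your_new_lora_model_name",
    "prompt": "Hello, my name is",
    "max_tokens": 32
  }'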
We will review #3308 as well, since it also has a delete API for completeness.
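For illustration only, a matching removal call could be shaped the same way as the add call; the endpoint name and payload here are hypothetical, since the actual delete API lives in #3308 rather than this PR:

# hypothetical endpoint, sketched for symmetry with /add_lora; see #3308 for the real delete API
curl -X POST your_host:your_port/remove_lora \
  -H "Content-Type: application/json" \
  -d '{
    "lora_name": "your_new_lora_model_name"
  }'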
Could you please tell me when the online service feature for adding LoRA weights will be officially integrated?
We have a different version with more tests and better coverage (chat/completion/embedding). Let me rebase it onto master and share it with the community. @TangJiakai @simon-mo
This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!
This pull request has merge conflicts that must be resolved before it can be merged. @AlphaINF please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork