
Allow specifying `adapter_id` on `chat/completions` requests

Open tsvisab opened this issue 9 months ago • 4 comments

Feature request

It seems that if I want to load a base model with an adapter and consume it, I have to use the `generate` route, which is the only one that allows specifying `adapter_id`:

```shell
curl 127.0.0.1:3000/generate \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{ "inputs": "Was \"The Office\" the funniest TV series ever?", "parameters": { "max_new_tokens": 200, "adapter_id": "tv_knowledge_id" } }'
```

but there is no way to pass `adapter_id` on `v1/chat/completions`.
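For illustration, here is a sketch of what such a request could look like on the chat route (the `adapter_id` field on `v1/chat/completions` is hypothetical here; it's the feature being requested, and the model name is only a placeholder):

```shell
# Build the payload separately so it can be inspected before sending.
# NOTE: "adapter_id" is not an existing parameter on this route yet --
# this is the shape of the request I would like to be able to send.
payload='{
  "model": "tgi",
  "messages": [
    { "role": "user", "content": "Was \"The Office\" the funniest TV series ever?" }
  ],
  "max_tokens": 200,
  "adapter_id": "tv_knowledge_id"
}'

# Sanity-check that the payload is valid JSON before sending it.
echo "$payload" | python3 -m json.tool > /dev/null && echo "payload OK"

# Send to a locally running TGI instance, if one is up.
curl --max-time 2 127.0.0.1:3000/v1/chat/completions \
    -X POST \
    -H 'Content-Type: application/json' \
    -d "$payload" || echo "no TGI server at 127.0.0.1:3000"
```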

Are you planning to support this?

Motivation

Many users consume models through `v1/chat/completions` and train LoRA adapters to use with it.

Your contribution

Maybe, if you're over capacity.

tsvisab avatar Jan 22 '25 11:01 tsvisab