
Support guided decoding with vllm and remote::vllm

ashwinb opened this issue 11 months ago

🚀 The feature, motivation and pitch

Several providers (fireworks, together, meta-reference) support guided decoding with inference -- e.g., specifying a JSON schema as a "grammar" that constrains decoding. vLLM supports this functionality too -- enable it in the API for the vllm and remote::vllm providers.
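As a rough sketch of what this looks like on the vLLM side: its OpenAI-compatible server accepts a `guided_json` extra parameter on chat completions, constraining decoding to a JSON schema. The server address, model name, and schema below are assumptions for illustration, not part of this issue.

```python
"""Sketch: requesting guided (JSON-schema-constrained) decoding from a
vLLM OpenAI-compatible server via its `guided_json` extension."""

# Example schema the model's output must conform to (illustrative).
ANSWER_SCHEMA = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "population": {"type": "integer"},
    },
    "required": ["city", "population"],
}


def build_request_body(prompt: str,
                       model: str = "meta-llama/Llama-3.1-8B-Instruct") -> dict:
    """Build a chat-completion payload; `guided_json` carries the schema."""
    return {
        "model": model,  # assumed model name
        "messages": [{"role": "user", "content": prompt}],
        # vLLM-specific extension: constrain decoding to this JSON schema.
        "guided_json": ANSWER_SCHEMA,
    }


if __name__ == "__main__":
    import json
    import urllib.request

    body = build_request_body("What is the largest city in France?")
    req = urllib.request.Request(
        "http://localhost:8000/v1/chat/completions",  # assumed server address
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

Wiring this through llama-stack would mean translating the API's response-format/grammar option into this parameter for the vllm and remote::vllm providers.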

Alternatives

No alternatives; this is a core feature that must be supported by all providers (as far as possible).

Additional context

No response

ashwinb avatar Nov 07 '24 06:11 ashwinb