llama-stack
Support guided decoding with vllm and remote::vllm
🚀 The feature, motivation and pitch
Several inference providers (fireworks, together, meta-reference) already support guided decoding during inference -- for example, supplying a JSON schema as a "grammar" that constrains generation. vLLM supports this functionality as well; enable it in the API for the vllm and remote::vllm providers.
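As a rough sketch of what the provider would send, vLLM's OpenAI-compatible server accepts a `guided_json` extra parameter carrying a JSON schema that constrains decoding. The model name, schema fields, and helper function below are illustrative assumptions, not the actual llama-stack implementation:

```python
import json

# Illustrative schema the caller wants the model's output to conform to.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "year": {"type": "integer"},
    },
    "required": ["name", "year"],
}


def build_guided_request(prompt: str, json_schema: dict) -> dict:
    """Build a chat-completion payload with schema-constrained (guided) decoding.

    vLLM's OpenAI-compatible endpoint reads guided-decoding options such as
    `guided_json` from extra fields in the request body.
    """
    return {
        "model": "meta-llama/Llama-3.1-8B-Instruct",  # illustrative model name
        "messages": [{"role": "user", "content": prompt}],
        "guided_json": json_schema,  # constrain decoding to this schema
    }


payload = build_guided_request("Extract the movie name and release year.", schema)
print(json.dumps(payload["guided_json"]["required"]))
```

The remote::vllm provider could translate llama-stack's structured-output request into this payload shape when forwarding to a vLLM server.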
Alternatives
No alternatives; this is a core feature that should be supported by all providers wherever possible.
Additional context
No response