llama-stack
Support "stop" parameter in inference providers
🚀 Describe the new functionality needed
"stop" parameter: https://platform.openai.com/docs/api-reference/completions/create#completions-create-stop
💡 Why is this needed? What if we don't build it?
Currently, only the vLLM inference provider supports and is tested with the `stop` parameter, via https://github.com/meta-llama/llama-stack/pull/1715. The other inference providers do not honor it, as sketched below.
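For context, the vLLM engine already exposes stop sequences through its sampling parameters, which is presumably what the vLLM provider builds on; a sketch at the engine level (not necessarily how the linked PR wires it up, and the model name is a placeholder):

```python
# Sketch of stop-sequence handling at the vLLM engine level.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # placeholder model
params = SamplingParams(
    max_tokens=64,
    # vLLM stops decoding once any of these strings appears in the output.
    stop=["\n4.", "###"],
)
outputs = llm.generate(["List three colors:\n1."], params)
print(outputs[0].outputs[0].text)
```

Other providers would need an equivalent mapping from the request's `stop` field onto their own sampling or generation options.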
Other thoughts
No response