worker-vllm
worker-vllm copied to clipboard
Update documentation to note support for extra parameters
Greetings!
I just wanted to make a quick note that the documentation for worker-vllm and RunPod both don't seem to mention anything about vLLM supporting guided generation via Json schemas or Regex/grammar patterns, but it DOES in fact support it as vLLM itself supports it.
It's a great feature and more people should consider using it for sure. In case you're not familiar, check out the vLLM docs for details about the "extra" parameters on the OpenAI completions endpoints:
https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#extra-parameters-for-chat-api