Guided decoding backends not working - both old and new APIs fail
Neither the old guided decoding API (guided_json) nor the new structured outputs API (structured_outputs) works in this worker. JSON schema constraints are ignored with the old API, and the new API throws a parameter error.
Issues:
- Old API (guided_json): Schema constraints are completely ignored - model follows contradictory system prompts instead of enforced JSON
- New API (structured_outputs): Throws "Unexpected keyword argument 'extra_body'" error
- Both backends affected: outlines and lm-format-enforcer both fail to enforce constraints
Steps to reproduce:
Test 1 - Old API failure:
{
  "input": {
    "messages": [
      {
        "role": "system",
        "content": "OUTPUT YAML ONLY: reasoning: |\n your thoughts\nquestion: your question"
      },
      {
        "role": "user",
        "content": "Analyze: The Eiffel Tower was completed in 1889 for the World's Fair."
      }
    ],
    "sampling_params": {
      "max_tokens": 500,
      "temperature": 0.1
    },
    "guided_json": {
      "type": "object",
      "properties": {
        "reasoning": {"type": "string"},
        "question": {"type": "string"}
      },
      "required": ["reasoning", "question"]
    },
    "guided_decoding_backend": "outlines"
  }
}
Result: Model outputs YAML, completely ignoring JSON schema constraints.
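To check this result programmatically rather than by eye, a minimal sketch along these lines could be used. It relies only on the standard library and checks just what this particular schema asks for (a JSON object with string-valued reasoning and question fields). The helper name matches_schema and the two sample outputs are hypothetical; they are not part of the worker or of vLLM, and this is not a general JSON Schema validator.

```python
import json

# The schema from the request above.
SCHEMA = {
    "type": "object",
    "properties": {
        "reasoning": {"type": "string"},
        "question": {"type": "string"},
    },
    "required": ["reasoning", "question"],
}


def matches_schema(output_text: str, schema: dict = SCHEMA) -> bool:
    """Return True if output_text is a JSON object with all required string fields.

    Hypothetical helper: only handles the flat, string-only schema used here.
    """
    try:
        data = json.loads(output_text)
    except json.JSONDecodeError:
        # YAML or free text ends up here.
        return False
    if not isinstance(data, dict):
        return False
    for key in schema.get("required", []):
        if key not in data:
            return False
    for key, spec in schema.get("properties", {}).items():
        if key in data and spec.get("type") == "string" and not isinstance(data[key], str):
            return False
    return True


# Illustrative outputs (made up for this sketch, not captured from the worker).
yaml_output = "reasoning: |\n  The tower was built for the fair.\nquestion: Who designed it?"
json_output = '{"reasoning": "Built for the fair.", "question": "Who designed it?"}'

print(matches_schema(yaml_output))
print(matches_schema(json_output))
```

If guided decoding were enforced, the check should pass regardless of what the system prompt asks for.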
Test 2 - New API failure:
{
  "input": {
    "messages": [
      {"role": "user", "content": "Analyze: Some text"}
    ],
    "sampling_params": {
      "max_tokens": 500,
      "temperature": 0.1,
      "extra_body": {
        "structured_outputs": {
          "json": {
            "type": "object",
            "properties": {
              "reasoning": {"type": "string"},
              "question": {"type": "string"}
            },
            "required": ["reasoning", "question"]
          }
        }
      }
    }
  }
}
Result: "Unexpected keyword argument 'extra_body'" error.
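One possible reading of this error, not confirmed against the worker source: in the OpenAI Python client, extra_body is a keyword argument of the client call that merges extra fields into the request body; it is not itself a sampling parameter. If the worker unpacks sampling_params as keyword arguments into a strict parameter class, any unknown key such as extra_body would be rejected. The sketch below only illustrates that mechanism with a hypothetical stand-in class (StandInSamplingParams is not vLLM's SamplingParams, and the unpacking step is an assumption).

```python
from dataclasses import dataclass


@dataclass
class StandInSamplingParams:
    """Hypothetical stand-in for a strict parameter class; NOT vLLM's SamplingParams."""
    max_tokens: int = 16
    temperature: float = 1.0


def build_params(sampling_params: dict) -> StandInSamplingParams:
    # Assumption: the request's sampling_params dict is unpacked as keyword arguments.
    return StandInSamplingParams(**sampling_params)


def try_build(sampling_params: dict) -> str:
    """Return "ok" or a short description of why construction was rejected."""
    try:
        build_params(sampling_params)
        return "ok"
    except TypeError as exc:
        return f"rejected: {exc}"


plain = {"max_tokens": 500, "temperature": 0.1}
with_extra = {**plain, "extra_body": {"structured_outputs": {"json": {"type": "object"}}}}

print(try_build(plain))
print(try_build(with_extra))
```

Under that assumption, the error would say nothing about whether structured outputs are supported; it would only mean that extra_body is not a recognised key at that level of the payload.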
Environment:
- Worker: worker-vllm v2.9.6 (vLLM 0.11.0)
- GUIDED_DECODING_BACKEND environment variable set to outlines
Possible cause:
The worker was updated to vLLM 0.11.0 but may not have been updated to handle the API changes. vLLM 0.11.0 introduced the new structured_outputs API while deprecating the old guided decoding fields, but the worker's custom wrapper doesn't support the new API and the old API may no longer function properly.
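If the wrapper does need a compatibility shim, the translation could look roughly like the sketch below. It is a plain dictionary rewrite that moves the legacy guided_json field under a structured_outputs key. The helper name to_structured_outputs and the placement of structured_outputs at the top level of input are assumptions for illustration; the thread does not show where the worker actually expects the new field.

```python
import copy


def to_structured_outputs(request_input: dict) -> dict:
    """Return a copy of request_input with legacy guided_json moved under structured_outputs.

    Hypothetical helper: the target layout is an assumption, not the worker's documented format.
    """
    converted = copy.deepcopy(request_input)  # never mutate the caller's payload
    schema = converted.pop("guided_json", None)
    if schema is not None:
        converted.setdefault("structured_outputs", {})["json"] = schema
    return converted


legacy_input = {
    "messages": [{"role": "user", "content": "Analyze: Some text"}],
    "sampling_params": {"max_tokens": 500, "temperature": 0.1},
    "guided_json": {
        "type": "object",
        "properties": {
            "reasoning": {"type": "string"},
            "question": {"type": "string"},
        },
        "required": ["reasoning", "question"],
    },
}

new_input = to_structured_outputs(legacy_input)
print(sorted(new_input))
```

Requests without guided_json pass through unchanged, so a shim like this could sit in front of both old and new clients.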
Both guided decoding approaches are currently unusable in this worker version.
Maybe this is actually the reason for https://github.com/runpod-workers/worker-vllm/issues/233
I deployed worker-vllm v2.9.4, which uses the vLLM library v0.10.0. My goal was to test the functionality of the old guided_json API before the introduction of the structured_outputs API in vLLM 0.11.0.
Results:
- structured_outputs API (Test 1): As expected for vLLM 0.10.0, this new API failed with the error "Unexpected keyword argument 'extra_body'". This confirms the API is not available in this version of the underlying library.
- guided_json API (Tests 2 & 3): The API call succeeded, but the core functionality was still broken. The model output was YAML (as instructed by the system prompt) and did not conform to the JSON schema specified in the guided_json parameter. The schema constraints were ignored.
Conclusion:
Downgrading to worker-vllm v2.9.4 (vLLM 0.10.0) did not fix the issue with guided decoding. The guided_json API, which should have been the correct and supported method in this version, failed to enforce the specified JSON schema. The output format was determined by the conflicting system prompt rather than the guided_json constraint. Therefore, the core problem existed in this older worker version as well.
Test 1:
{
  "input": {
    "messages": [
      {
        "role": "system",
        "content": "OUTPUT YAML ONLY: reasoning: |\n your thoughts\nquestion: your question"
      },
      {
        "role": "user",
        "content": "Analyze: The Eiffel Tower was completed in 1889 for the World's Fair."
      }
    ],
    "sampling_params": {
      "max_tokens": 500,
      "temperature": 0.1,
      "extra_body": {
        "structured_outputs": {
          "json": {
            "type": "object",
            "properties": {
              "reasoning": {"type": "string"},
              "question": {"type": "string"}
            },
            "required": ["reasoning", "question"]
          }
        }
      }
    }
  }
}
Test 2:
{
  "input": {
    "messages": [
      {
        "role": "system",
        "content": "OUTPUT YAML ONLY: reasoning: |\n your thoughts\nquestion: your question"
      },
      {
        "role": "user",
        "content": "Analyze: The Eiffel Tower was completed in 1889 for the World's Fair."
      }
    ],
    "sampling_params": {
      "max_tokens": 500,
      "temperature": 0.1
    },
    "guided_json": {
      "type": "object",
      "properties": {
        "reasoning": {"type": "string"},
        "question": {"type": "string"}
      },
      "required": ["reasoning", "question"]
    },
    "guided_decoding_backend": "outlines"
  }
}
Test 3:
{
  "input": {
    "messages": [
      {
        "role": "system",
        "content": "OUTPUT YAML ONLY: reasoning: |\n your thoughts\nquestion: your question"
      },
      {
        "role": "user",
        "content": "Analyze: The Eiffel Tower was completed in 1889 for the World's Fair."
      }
    ],
    "sampling_params": {
      "max_tokens": 500,
      "temperature": 0.1
    },
    "guided_json": {
      "type": "object",
      "properties": {
        "reasoning": {"type": "string"},
        "question": {"type": "string"}
      },
      "required": ["reasoning", "question"]
    },
    "guided_decoding_backend": "lm-format-enforcer"
  }
}
Maybe this is actually the reason for #233
No, I got structured outputs working now, but it still doesn't enforce field order, whether using
GUIDED_DECODING_BACKEND=outlines
or
GUIDED_DECODING_BACKEND=lm-format-enforcer
LMFE_STRICT_JSON_FIELD_ORDER=true
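For anyone trying to reproduce the ordering behaviour, a small check like the following could make the comparison explicit. It treats "order" as the declaration order of properties in the schema, which is an assumption about what is being expected here; JSON objects themselves are unordered by specification. The helper name keys_in_schema_order is hypothetical.

```python
import json


def keys_in_schema_order(output_text: str, schema: dict) -> bool:
    """Return True if the output's keys appear in the same order as schema["properties"].

    Hypothetical helper: keys not declared in the schema are ignored.
    """
    expected = list(schema.get("properties", {}))
    data = json.loads(output_text)  # dicts preserve insertion order in Python 3.7+
    actual = [key for key in data if key in expected]
    wanted = [key for key in expected if key in data]
    return actual == wanted


schema = {
    "type": "object",
    "properties": {
        "reasoning": {"type": "string"},
        "question": {"type": "string"},
    },
    "required": ["reasoning", "question"],
}

print(keys_in_schema_order('{"reasoning": "a", "question": "b"}', schema))
print(keys_in_schema_order('{"question": "b", "reasoning": "a"}', schema))
```

Running this over a batch of responses per backend would show whether either setting changes how often the declared order is respected.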