beta.chat.completions.parse returns unhandled ValidationError
Confirm this is an issue with the Python library and not an underlying OpenAI API
- [X] This is an issue with the Python library
Describe the bug
On some occasions, while using the Chat Completions API with Structured Outputs, the SDK fails with an unhandled ValidationError:
ValidationError: 1 validation error for RawResponse
Invalid JSON: EOF while parsing a value at line 1 column 600 [type=json_invalid, input_value=' ... ', input_type=str]
For further information visit https://errors.pydantic.dev/2.9/v/json_invalid
This does not happen every time, but we use it in a production service and this unpredictable behavior is hard to prevent.
To Reproduce
- Create a Pydantic model
- Instantiate an OpenAI client
- Use the method OpenAI.beta.chat.completions.parse(...) with the following arguments
- Repeat a few times to see the error
from pydantic import BaseModel
from openai import OpenAI

class RawResponse(BaseModel):
    answer: str

client = OpenAI(api_key=...)

completion = client.beta.chat.completions.parse(
    model='gpt-4o-2024-08-06',
    messages=messages,
    max_tokens=750,
    n=1,
    stop=None,
    temperature=0.1,
    response_format=RawResponse,
)
After a few attempts, this fails with:
ValidationError: 1 validation error for RawResponse
Invalid JSON: EOF while parsing a value at line 1 column 600 [type=json_invalid, input_value=' ... ', input_type=str]
For further information visit https://errors.pydantic.dev/2.9/v/json_invalid
Code snippets
No response
OS
debian:bullseye-slim
Python version
CPython 3.10.8
Library version
openai 1.48.0
Thanks for the report. It looks like your example script isn't fully complete; could you share a full script?
Hi, thanks for the quick reply! Sadly I can't provide a full script for privacy reasons but I'll be happy to share any information you need for identifying the issue. Here's the traceback:
File "/app/src/core/modules/emiGPT/core/openai_chat_api.py", line 95, in _get_response
completion = self._client.beta.chat.completions.parse(
File "/opt/venv/lib/python3.10/site-packages/openai/resources/beta/chat/completions.py", line 145, in parse
return _parse_chat_completion(
File "/opt/venv/lib/python3.10/site-packages/openai/lib/_parsing/_completions.py", line 110, in parse_chat_completion
"parsed": maybe_parse_content(
File "/opt/venv/lib/python3.10/site-packages/openai/lib/_parsing/_completions.py", line 161, in maybe_parse_content
return _parse_content(response_format, message.content)
File "/opt/venv/lib/python3.10/site-packages/openai/lib/_parsing/_completions.py", line 221, in _parse_content
return cast(ResponseFormatT, model_parse_json(response_format, content))
File "/opt/venv/lib/python3.10/site-packages/openai/_compat.py", line 166, in model_parse_json
return model.model_validate_json(data)
File "/opt/venv/lib/python3.10/site-packages/pydantic/main.py", line 625, in model_validate_json
return cls.__pydantic_validator__.validate_json(json_data, strict=strict, context=context)
pydantic_core._pydantic_core.ValidationError: 1 validation error for RawResponse
Invalid JSON: EOF while parsing a value at line 1 column 600 [type=json_invalid, input_value=' ... ', input_type=str]
For further information visit https://errors.pydantic.dev/2.9/v/json_invalid
Please let me know if there's anything else you need.
Could you share a request ID from a failing request? https://github.com/openai/openai-python#request-ids
Looking at our application logs, the call to client.beta.chat.completions.parse(...) raised an exception, so there is no result object from which to extract a request_id. 😞
Ahhh right, sorry! If you don't already have debug logging enabled, could you enable it (https://github.com/openai/openai-python#logging)? That should show a request ID in the logs.
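For example, a minimal sketch of enabling it (this relies on the OPENAI_LOG environment variable described in that README section):

import os

# Set before importing the SDK so its standard-library logging setup picks it up;
# with debug logging on, the request ID shows up alongside the HTTP details.
os.environ["OPENAI_LOG"] = "debug"

from openai import OpenAI

client = OpenAI()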
Sure thing!
Hi there! Any updates here? FYI, the same thing is happening for me, probably 50% of the time:
openai==1.53.0
pydantic==2.9.2
pydantic_core==2.23.4
I've worked around this in the meantime with a tenacity retry, but it adds latency and extra calls, which isn't ideal...
from pydantic import ValidationError
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

# messages, model, async_client, MyPydanticModel, FUNCTION_LIST and logger
# come from the surrounding application code.

try:
    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=0, min=0, max=0),  # No wait between retries
        retry=retry_if_exception_type(ValidationError),
        # Note: `before` runs before every attempt, including the first one.
        before=lambda retry_state: messages.append({
            "role": "system",
            "content": "Raised Exception: pydantic_core._pydantic_core.ValidationError. Please try again and confirm to model specs."
        })
    )
    async def attempt_parse():
        return await async_client.beta.chat.completions.parse(
            model=model,
            messages=messages,
            response_format=MyPydanticModel,
            functions=FUNCTION_LIST,
            function_call="auto",
        )

    response = await attempt_parse()
except ValidationError as e:
    logger.error(f"Failed to parse response after all retries: {e}")
    raise
I'm experiencing the same issue.
I'm considering consuming the API directly myself as this is a bit of a pain.
Is there an intention to address this bug?
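For what it's worth, a rough sketch of what "consuming the API directly" could look like (this is not the SDK's own approach; RawResponse is the model from the original report, and the messages here are placeholders): call the plain create() endpoint with an explicit JSON-schema response format and validate the content yourself, so a malformed reply surfaces as a catchable ValidationError instead of failing inside parse().

from pydantic import BaseModel, ValidationError
from openai import OpenAI

class RawResponse(BaseModel):
    answer: str

client = OpenAI()
messages = [{"role": "user", "content": "..."}]  # placeholder prompt

# Plain create() call with a hand-written JSON-schema response format;
# strict schemas must declare additionalProperties: false.
completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=messages,
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "raw_response",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {"answer": {"type": "string"}},
                "required": ["answer"],
                "additionalProperties": False,
            },
        },
    },
)

content = completion.choices[0].message.content or ""
try:
    parsed = RawResponse.model_validate_json(content)
except ValidationError:
    # Truncated or whitespace-only content lands here instead of blowing up
    # inside the SDK helper; keep `content` and completion.id for debugging.
    raise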
How complex are the models you're putting together? I wonder if it's just a token limit or something?
e.g. here's a simple Pydantic model from a test script I was playing around with:
from typing import Optional
from pydantic import BaseModel, Field

class UserInformation(BaseModel):
    name: Optional[str] = Field(description="Name of the user")
    email: Optional[str] = Field(description="Email address of the user")
    phone: Optional[str] = Field(description="Phone number of the user. Store this in the format '+1 123-456-7890'")
    title_role: Optional[str] = Field(description="The Title or Role of the user at their company")
    company_name: Optional[str] = Field(description="Name of the company the user works for")
which explodes into:
{
  "properties": {
    "name": {
      "anyOf": [
        { "type": "string" },
        { "type": "null" }
      ],
      "description": "Name of the user",
      "title": "Name"
    },
    "email": {
      "anyOf": [
        { "type": "string" },
        { "type": "null" }
      ],
      "description": "Email address of the user",
      "title": "Email"
    },
    "phone": {
      "anyOf": [
        { "type": "string" },
        { "type": "null" }
      ],
      "description": "Phone number of the user. Store this in the format +1 123-456-7890",
      "title": "Phone"
    },
    "title_role": {
      "anyOf": [
        { "type": "string" },
        { "type": "null" }
      ],
      "description": "The Title or Role of the user at their company",
      "title": "Title Role"
    },
    "company_name": {
      "anyOf": [
        { "type": "string" },
        { "type": "null" }
      ],
      "description": "Name of the company the user works for",
      "title": "Company Name"
    }
  },
  "required": [
    "name",
    "email",
    "phone",
    "title_role",
    "company_name"
  ],
  "title": "UserInformation",
  "type": "object"
}
A fairly complex model would generate quite a few schema tokens, which might get missed or misinterpreted. I wonder if a simplified JSON blob for the response_format would help things?
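(As a point of reference, the expansion above is roughly what Pydantic v2 emits on its own; a quick way to inspect, and if needed hand-trim, that blob before experimenting with it as a plain schema:)

import json

# Print the JSON Schema Pydantic generates for the model, so it can be
# inspected or simplified by hand before being sent as a response_format.
print(json.dumps(UserInformation.model_json_schema(), indent=2))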
My schema is only slightly more complex than your example, very similar, but wrapped in an array allowing the model to return multiple entries for each prompt. Output token lengths are fairly reasonable, but I assume there must be pathological cases.
Did anybody find a fix for this issue? It only recently started appearing for me.
Thanks @RobertCraigie for merging the PR! @DeterjoSimon, just pushed the fix; it should be available in the next release!
@rjoshi I don't think your PR would solve all the issues encountered, as the original report includes this in the error message
line 1 column 600
which means the content was non-empty
Thanks @RobertCraigie, I see. I have not yet run into this issue but will keep an eye out for it. We at Starspark.AI are betting heavily on structured responses to enable our product scenarios.
We experience the same issue, but with a different model, gpt-4o-mini-2024-07-18, deployed on Azure OpenAI (Sweden Central).
Some diagnostic info:
# python -V
Python 3.11.8
# pip list | grep openai
langchain-openai 0.3.0
openai 1.59.7
Here is our structured output model:
from typing import List, TypeAlias, Union

from langchain_core.documents.base import Document
from pydantic import BaseModel, Field

# Type aliases for improved readability
CitationList: TypeAlias = List["Citation"]
DocumentList: TypeAlias = List[Document]


class Citation(BaseModel):
    """
    Citation from a specific document that justifies an answer.

    Note:
        A chat model must include these document attributes in the context of the
        prompt to return a structured output.

    Attributes:
        document_id (int): The integer ID of a specific document which justifies
            the answer.
        quote (str): The verbatim quote from the specified source that justifies
            the answer.
        title (str): The title of the document that contains the quote.
        metadata_storage_name (str): The filename of the document, including its
            extension, in the storage system.
        metadata_storage_path (str): The path to the document in the storage
            system.
        source_url (str): The URL of the source document that contains
            the quote.
    """

    document_id: int = Field(
        ...,
        description=(
            "The integer ID of a SPECIFIC document which justifies the answer."
        ),
    )
    quote: str = Field(
        ...,
        description=(
            "The VERBATIM quote from the specified source that justifies the answer."
        ),
    )
    title: str = Field(
        ...,
        description="The title of the document that contains the quote.",
    )
    metadata_storage_name: str = Field(
        ...,
        description=(
            "The filename of the document, including its extension, in the storage"
            " system."
        ),
    )
    source_url: str | None = Field(
        None,
        description="The URL of the source document that contains the quote.",
    )


class QuotedAnswer(BaseModel):
    """Answer the user question based only on the given sources, and cite the
    sources used."""

    answer: str = Field(
        ...,
        description=(
            "The answer to the user question, which is based only on the given"
            " sources."
        ),
    )
    citations: CitationList = Field(
        ...,
        description="Citations from the given sources that justify the answer.",
    )


class ConversationalResponse(BaseModel):
    """Respond in a conversational manner. Be kind and helpful."""

    response: str = Field(description="A conversational response to the user's query")


class FinalResponse(BaseModel):
    """Final response containing either quoted or conversational answer."""

    final_output: Union[QuotedAnswer, ConversationalResponse]
@ms-86 How do you trigger the parsing error? Does it appear with a specific class?
I'm guessing you are attaching the Pydantic schema to a LangChain LLM, something like llm.with_structured_output(schema=FinalResponse)?
@DeterjoSimon You're correct. But since we've been experiencing a lot of hallucinations in our responses (not matching the schema we defined), we decided to include the raw answer and, in case of a parsing error, parse it ourselves using some heuristics:
chat_model.with_structured_output(
    schema=FinalResponse, include_raw=True
)
Today I found something in our telemetry which could help in finding the root cause of this bug. It seems that the input being parsed into the Pydantic schema is just full of 0x0a (line feed) characters:
{
"errors": [
{
"type": "json_invalid",
"loc": "()",
"msg": "Invalid JSON: EOF while parsing a value at line 1804 column 0",
"input": "\n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n 
\n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n \n \n\n",
"ctx": { "error": "EOF while parsing a value at line 1804 column 0" },
"url": "https://errors.pydantic.dev/2.10/v/json_invalid"
}
],
"title": "FinalResponse",
"json": "[{\"type\":\"json_invalid\",\"loc\":[],\"msg\":\"Invalid JSON: EOF while parsing a value at line 1804 column 0\",\"input\":\"\\n \\n (more newline chars..."
}
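Given the telemetry above, a small pre-parse guard, assuming one wants to fail fast on whitespace-only content rather than let Pydantic raise json_invalid (raw_content here stands for whatever raw message text your pipeline gets back):

def is_effectively_empty(raw_content: str | None) -> bool:
    # The failing inputs above consist solely of spaces and 0x0a line feeds,
    # so anything that strips down to an empty string cannot be valid JSON.
    return raw_content is None or raw_content.strip() == ""

# Example use: retry or surface a clearer error instead of parsing.
# if is_effectively_empty(raw_content):
#     raise ValueError("Model returned whitespace-only content; retrying.")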
Is this issue fixed? I'm still hitting it, and I don't know if it's because I'm using OpenAI through Azure or something like that.