pydantic-ai
finish_reason for RunResult
Is there a way to view the finish_reason for a RunResult? (e.g. end_turn, max_tokens, refusal, etc.)
I'm using Vertex AI and I thought it might be available in RunResult.usage().details, but that's actually None for me.
The reason I ask is that when a completion runs longer than max_tokens, I need to know so that I can give the LLM a chance to finish the completion.
I'm using pydantic-ai-slim[vertexai]==0.0.14 and VertexAIModel('gemini-1.5-flash')
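Until finish_reason is exposed, one workaround is a continuation loop: when a response looks truncated, feed the partial output back and ask the model to continue. Below is a minimal runnable sketch with a stubbed model; `call_model` and its `(text, finish_reason)` return shape are hypothetical stand-ins for a real provider call, not pydantic-ai API.

```python
# Sketch of a "continue on truncation" loop. Everything here is a stub:
# `call_model` simulates a provider that truncates output at `max_tokens`
# characters; its (text, finish_reason) return shape is an assumption,
# not pydantic-ai API.

def call_model(context: str, max_tokens: int) -> tuple[str, str]:
    """Stub model: continues a fixed answer from wherever `context` left off."""
    full = "Hello, world!"
    # Work out how much of the answer the context already contains.
    done = 0
    for i in range(len(full), -1, -1):
        if context.endswith(full[:i]):
            done = i
            break
    chunk = full[done:done + max_tokens]
    reason = "length" if done + len(chunk) < len(full) else "stop"
    return chunk, reason

def run_with_continuation(prompt: str, max_tokens: int, max_calls: int = 5) -> str:
    """Call the model; while it stops for 'length', feed the partial text back."""
    text, reason = call_model(prompt, max_tokens)
    calls = 1
    while reason == "length" and calls < max_calls:
        chunk, reason = call_model(prompt + text, max_tokens)
        text += chunk
        calls += 1
    return text

print(run_with_continuation("Say hi: ", max_tokens=5))  # prints "Hello, world!"
```

With max_tokens=5 the stub needs three calls to emit the full 13-character answer, which is exactly the behavior a real finish_reason would let us drive.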
I think that's a reasonable request.
Indeed, this is quite useful: we want to be able to adjust max_tokens so as to better manage our usage across various providers.
Right now, when I run with a BaseModel as the result_type and max_tokens=5 (failing on purpose), I get these results:
OpenAI: gpt-4o-2024-11-20
pydantic_ai.exceptions.ModelHTTPError: status_code: 400, model_name: gpt-4o-2024-11-20, body: {'message': 'Could not finish the message because max_tokens was reached. Please try again with higher max_tokens.', 'type': 'invalid_request_error', 'param': None, 'code': None}
Bedrock: anthropic.claude-3-5-sonnet-20240620-v1:0
pydantic_core._pydantic_core.ValidationError: 1 validation error for DynamicModel
Invalid JSON: EOF while parsing an object at line 1 column 1 [type=json_invalid, input_value='{', input_type=str]
For further information visit https://errors.pydantic.dev/2.11/v/json_invalid
Gemini: gemini-2.0-flash
pydantic_core._pydantic_core.ValidationError: 1 validation error for DynamicModel
Invalid JSON: EOF while parsing a value at line 1 column 0 [type=json_invalid, input_value='', input_type=str]
For further information visit https://errors.pydantic.dev/2.11/v/json_invalid
So ideally the finish_reason would always be present in the exception so we can determine the root cause of the issue.
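The tracebacks above are all the same underlying failure: a completion cut off mid-JSON before validation. That can be reproduced with the standard library alone; pydantic's json_invalid error wraps the same condition as the stdlib parser hitting EOF.

```python
import json

# A response truncated by max_tokens leaves invalid JSON behind,
# e.g. a lone opening brace (Bedrock case) or an empty string (Gemini case).
for truncated in ("{", ""):
    try:
        json.loads(truncated)
    except json.JSONDecodeError as exc:
        print(f"{truncated!r}: {exc.msg} at line {exc.lineno} column {exc.colno}")
```

Without the provider's finish_reason attached, there is no way to tell this truncation apart from a model that genuinely emitted malformed JSON.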
I was implementing this when I noticed that the OpenAI Responses API doesn't have this field. Is the field not necessary? 🤔
I think it's in ResponseOutputMessage.status='incomplete'
The details appear to be in incomplete_details
main.py

```python
import os
from pprint import pp

from openai import OpenAI

client = OpenAI(
    # This is the default and can be omitted
    api_key=os.environ.get("OPENAI_API_KEY"),
)

response = client.responses.create(
    model="gpt-4o",
    instructions="You are a coding assistant that talks like a pirate.",
    input="How do I check if a Python object is an instance of a class?",
    max_output_tokens=16,
)

pp(dict(response))
```

```shell
uv run --with openai main.py
```
```
{'id': 'resp_68025457034c81919224f30793a71b1403aa0aff7e3119fc',
 'created_at': 1744983127.0,
 'error': None,
 'incomplete_details': IncompleteDetails(reason='max_output_tokens'),
 'instructions': 'You are a coding assistant that talks like a pirate.',
 'metadata': {},
 'model': 'gpt-4o-2024-08-06',
 'object': 'response',
 'output': [ResponseOutputMessage(id='msg_6802545783108191a725b6c4e6a95c6503aa0aff7e3119fc', content=[ResponseOutputText(annotations=[], text='Arrr matey! To check if a Python object be an instance of a', type='output_text')], role='assistant', status='incomplete', type='message')],
 'parallel_tool_calls': True,
 'temperature': 1.0,
 'tool_choice': 'auto',
 'tools': [],
 'top_p': 1.0,
 'max_output_tokens': 16,
 'previous_response_id': None,
 'reasoning': Reasoning(effort=None, generate_summary=None, summary=None),
 'service_tier': 'default',
 'status': 'incomplete',
 'text': ResponseTextConfig(format=ResponseFormatText(type='text')),
 'truncation': 'disabled',
 'usage': ResponseUsage(input_tokens=37, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=16, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=53),
 'user': None,
 'store': True}
```
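So on the Responses API, truncation is detectable from `status` plus `incomplete_details` rather than a finish_reason field. A sketch over a plain dict mirroring the payload above; the field names are taken from that printed output, not from any pydantic-ai API:

```python
# Truncation check against the Responses API shape shown above.
# The plain dict stands in for the parsed response object.
response = {
    "status": "incomplete",
    "incomplete_details": {"reason": "max_output_tokens"},
}

def hit_token_limit(resp: dict) -> bool:
    details = resp.get("incomplete_details") or {}
    return resp.get("status") == "incomplete" and details.get("reason") == "max_output_tokens"

print(hit_token_limit(response))  # prints True
```

A unified finish_reason on ModelResponse would spare callers from branching on per-provider shapes like this one.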
We just spent a lot of time tracking down rising retry counts for LLM tool calls; the cause turned out to be a small max_tokens configuration for the provider (the default value was in use).
Being able to see finish_reason='length' would help us monitor similar issues.
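Once finish_reason is surfaced, that kind of monitoring reduces to aggregating the reasons across runs. A small sketch with `collections.Counter`; the finish-reason strings here are illustrative samples, not values pulled from pydantic-ai:

```python
from collections import Counter

# Hypothetical finish reasons collected from a batch of runs.
finish_reasons = ["stop", "length", "stop", "length", "length", "tool_calls"]

counts = Counter(finish_reasons)
truncation_rate = counts["length"] / len(finish_reasons)
print(counts.most_common(1))            # prints [('length', 3)]
print(f"truncated: {truncation_rate:.0%}")  # prints truncated: 50%
```

A rising truncation rate would have flagged the misconfigured max_tokens long before the retry counts did.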
Isn't this closed by:
- https://github.com/pydantic/pydantic-ai/issues/886 (Add id and finish_reason to ModelResponse #886)
- https://github.com/pydantic/pydantic-ai/pull/2590 (Add ModelResponse.finish_reason and set provider_response_id while streaming #2590)
?