
[Feature] Add Support for Prompt-Based JSON Parsing Mode as an Alternative to Tool Calling

Open lazyhope opened this issue 11 months ago • 6 comments

Issue Description:

Currently, pydantic-ai implements structured output solely via model providers' tool-calling APIs. While this works in most cases, certain schemas supported by Pydantic behave inconsistently across model providers.

For instance, the following schema from the documentation does not work with Gemini models:

from datetime import date
from typing import TypedDict

from pydantic_ai import Agent


class UserProfile(TypedDict, total=False):
    name: str
    dob: date
    bio: str


agent = Agent(
    'gemini-2.0-flash-exp',
    result_type=UserProfile,
)
agent.run_sync("Generate a synthetic data")

This results in the following error:

UnexpectedModelBehavior: Unexpected response from gemini 400, body:
{
  "error": {
    "code": 400,
    "message": "* GenerateContentRequest.tools[0].function_declarations[0].parameters.properties[dob].format: only 'enum' is supported for STRING type\n",
    "status": "INVALID_ARGUMENT"
  }
}

In this example, the inconsistency stems from the model provider's limitations. However, based on my observations working with tools like instructor, modern LLMs are increasingly proficient at adhering to JSON-format prompts in their text responses. In fact, they often produce better JSON content in standard completion mode than in tool-calling mode. The Berkeley Function-Calling Leaderboard may provide further evidence of this trend.

Feature Request

Would it be possible for pydantic-ai to implement an alternative mode akin to instructor's MD_JSON mode? This mode could use prompt engineering to guide the LLM’s output and parse the resulting JSON as raw text rather than relying on tool-calling APIs.

Such a feature would:

  • Allow broader compatibility with any model capable of following JSON schema prompts.
  • Address model-specific inconsistencies while leveraging pydantic's full schema flexibility.

Thank you for considering this suggestion!

lazyhope avatar Jan 01 '25 17:01 lazyhope

FYI, here’s an example where instructor’s Markdown JSON mode works seamlessly with the UserProfile schema:

import instructor
from datetime import date
from typing import TypedDict
from litellm import completion

class UserProfile(TypedDict, total=False):
    name: str
    dob: date
    bio: str

client = instructor.from_litellm(completion, mode=instructor.Mode.MD_JSON)  # Switching to `instructor.Mode.TOOLS` would result in the same error mentioned earlier
user = client.chat.completions.create(
    model="gemini/gemini-2.0-flash-exp",
    messages=[
        {"role": "user", "content": "Generate a synthetic data"},
    ],
    response_model=UserProfile,
)

user

yields

UserProfile(name='Alice Wonderland', dob=datetime.date(1990, 3, 15), bio='A curious individual who loves to explore and discover new things.')

lazyhope avatar Jan 01 '25 17:01 lazyhope

@samuelcolvin, could you please take a look at this? My understanding is that we're already using JSON schemas from models to guide coercing outputs to certain types...

sydney-runkle avatar Jan 02 '25 14:01 sydney-runkle

@samuelcolvin, could you please take a look at this? My understanding is that we're already using JSON schemas from models to guide coercing outputs to certain types...

Yes, but my proposal is actually to have a mode that, instead of using model providers' tool-calling APIs, parses the raw text response as JSON for a given result_type. This may involve some additional prompting so that the model only outputs JSON in its response.

Here is the current implementation for OpenAI models, which parses the model's raw response and tool calls separately: https://github.com/pydantic/pydantic-ai/blob/c53c4e10aad816e1caaa51a5e4f19a894322577c/pydantic_ai_slim/pydantic_ai/models/openai.py#L209-L213

Under the proposed json mode, the code may look something like:

if choice.message.content is not None:
    items.append(result_type.model_validate_json(choice.message.content))

and if the model fails to output JSON text, or the output does not pass validation, retry.
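
For illustration, here's a minimal sketch of such a parse-and-retry loop, independent of pydantic-ai's internals; `call_llm` is a hypothetical stand-in for a raw chat-completion call:

from pydantic import BaseModel, ValidationError


class UserProfile(BaseModel):
    name: str
    bio: str


def call_llm(messages: list[dict]) -> str:
    """Hypothetical stand-in for a raw chat-completion call returning text."""
    raise NotImplementedError


def run_json_mode(prompt: str, max_retries: int = 3) -> UserProfile:
    schema = UserProfile.model_json_schema()
    messages = [
        {'role': 'system', 'content': f'Respond with JSON only, matching this schema:\n{schema}'},
        {'role': 'user', 'content': prompt},
    ]
    for _ in range(max_retries):
        raw = call_llm(messages)
        try:
            return UserProfile.model_validate_json(raw)
        except ValidationError as exc:
            # Feed the validation errors back so the model can correct itself.
            messages += [
                {'role': 'assistant', 'content': raw},
                {'role': 'user', 'content': f'Validation failed: {exc}. Return corrected JSON only.'},
            ]
    raise RuntimeError('Model never produced valid JSON')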

lazyhope avatar Jan 02 '25 14:01 lazyhope

See #514 which is related. You could implement this now in a custom model, I think that's how MistralModel works.

I don't think there's any reason to move or copy that logic into Agent.

samuelcolvin avatar Jan 03 '25 12:01 samuelcolvin

I'd be open to proposals/PRs with tweaks to the current model implementation that would make it easier to subclass/override and add functionality like this.

However, I will note that we can probably improve the handling of schemas with format in their fields independently, I'll open a PR to do that shortly.

dmontagu avatar Jan 03 '25 16:01 dmontagu

Thanks. I'll explore what can be done, as I still believe this is a crucial feature missing from many frameworks.

Its implementation should not introduce significant complexity to the project, as it primarily involves prompting and validating string content using Pydantic models. Moreover, it's broadly applicable across all LLMs.

lazyhope avatar Jan 04 '25 16:01 lazyhope

Currently, the open-source model-serving project vLLM does not support tool_choice=required, which breaks structured output.

Error Code: 400 - BadRequestError

Details:
OpenAIException - Error Code: 400
{
    "object": "error",
    "message": "[{
        'type': 'value_error',
        'loc': ('body',),
        'msg': 'Value error, `tool_choice` must either be a named tool, \"auto\", or \"none\".',
        'input': {
            'messages': [{'role': 'user', 'content': 'USA Capital'}],
            'model': 'qwen2.5-32b-awq',
            'n': 1,
            'parallel_tool_calls': True,
            'tool_choice': 'required',
            'tools': [{
                'type': 'function',
                'function': {
                    'name': 'final_result',
                    'description': 'The final response which ends this conversation',
                    'parameters': {
                        'properties': {
                            'city': {'title': 'City', 'type': 'string'},
                            'country': {'title': 'Country', 'type': 'string'},
                            'reason': {'title': 'Reason', 'type': 'string'}
                        },
                        'required': ['city', 'country', 'reason'],
                        'title': 'MyModel',
                        'type': 'object'
                    }
                }
            }]
        },
        'ctx': {
            'error': "ValueError('`tool_choice` must either be a named tool, \"auto\", or \"none\".')"
        }
    }]",
    "type": "BadRequestError",
    "param": None,
    "code": 400
}
Received Model Group: qwen2.5-32b
Available Model Group Fallbacks: None

But OpenAI-style structured output is supported:

Request:

{
    "model": "qwen2.5-32b",
    "temperature": 0.1,
    "messages": [
        {
            "role": "user",
            "content": "North city in the US"
        }
    ],
    "extra_body": {
        "guided_json": {
            "properties": {
                "city": {
                    "title": "City",
                    "type": "string"
                },
                "country": {
                    "title": "Country",
                    "type": "string"
                },
                "reason": {
                    "title": "Reason",
                    "type": "string"
                }
            },
            "required": [
                "city",
                "country",
                "reason"
            ],
            "title": "MyModel",
            "type": "object"
        }
    }
}

Output:

{
    "id": "chatcmpl-3d629978021b407d8163add87355a758",
    "created": 1736494263,
    "model": "qwen2.5-32b-awq",
    "object": "chat.completion",
    "system_fingerprint": null,
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "{\"city\": \"Seattle\", \"country\": \"US\", \"reason\": \"Seattle is often referred to as the 'Emerald City' and is located in the northern part of the United States.\"}",
                "role": "assistant",
                "tool_calls": null,
                "function_call": null
            }
        }
    ],
    "usage": {
        "completion_tokens": 42,
        "prompt_tokens": 192,
        "total_tokens": 234,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
    },
    "service_tier": null,
    "prompt_logprobs": null
}
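
For reference, the same guided_json request can be made from Python through the OpenAI client's extra_body passthrough; the base_url and model name below are assumptions matching the example above:

from openai import OpenAI
from pydantic import BaseModel


class MyModel(BaseModel):
    city: str
    country: str
    reason: str


# Assumes a vLLM server exposing its OpenAI-compatible endpoint locally.
client = OpenAI(base_url='http://localhost:8000/v1', api_key='EMPTY')

completion = client.chat.completions.create(
    model='qwen2.5-32b-awq',
    messages=[{'role': 'user', 'content': 'North city in the US'}],
    # vLLM-specific field; the client passes extra_body through untouched.
    extra_body={'guided_json': MyModel.model_json_schema()},
)
print(MyModel.model_validate_json(completion.choices[0].message.content))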

aisensiy avatar Jan 10 '25 07:01 aisensiy

We should support structured outputs as well as tool calls for the result_type where the model supports it.

samuelcolvin avatar Jan 16 '25 10:01 samuelcolvin

Looks like https://github.com/pydantic/pydantic-ai/issues/242 is also related?

Seems like structured outputs is the way to go since many providers support it natively

Finndersen avatar Jan 16 '25 11:01 Finndersen

We should support structured outputs as well as tool calls for the result_type where the model supports it.

Please note that the Structured Output APIs from both providers (OpenAI and Gemini) have limitations: they only support a subset of JSON Schema. Attributes like additionalProperties won't work. An example of a Pydantic model that is not supported:

from typing import Annotated

from pydantic import BaseModel, Field


class User(BaseModel):
    details: dict[
        Annotated[str, Field(description="User name", min_length=1)],
        Annotated[int, Field(description="User ID", gt=3)],
    ] = Field(max_length=1)

Its corresponding JSON schema:

{
	"properties": {
		"details": {
			"additionalProperties": {
				"description": "User ID",
				"exclusiveMinimum": 3,
				"type": "integer"
			},
			"maxProperties": 1,
			"propertyNames": {
				"description": "User name",
				"minLength": 1
			},
			"title": "Details",
			"type": "object"
		}
	},
	"required": ["details"],
	"title": "User",
	"type": "object"
}

Some useful references:

  • https://platform.openai.com/docs/guides/structured-outputs/examples#supported-schemas
  • https://dylancastillo.co/posts/gemini-structured-outputs.html
  • https://arxiv.org/abs/2408.02442

lazyhope avatar Jan 16 '25 23:01 lazyhope

Is there something like CodeAgent in smolagents?

They parse a code snippet that acts as the tool call. This might be very easy to adopt: providing raw Python documentation for a function or a model declaration would be enough.

Image

kerolos-sss avatar Jan 31 '25 13:01 kerolos-sss

Hi, I'm using both Ollama and llama.cpp; they support schema-constrained JSON as a parameter in the API call as well. I think giving users the option to pick whether to use a JSON schema or tool calling to return structured output might be a good idea. As far as I know, llama.cpp uses a grammar to enforce the JSON schema, which generally produces better and more stable output.

References: https://github.com/ggml-org/llama.cpp/tree/master/examples/server#post-v1chatcompletions-openai-compatible-chat-completions-api
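
For illustration, here's what schema-constrained JSON looks like against Ollama's API, where the format field accepts a JSON schema (this assumes a local Ollama server with the model pulled; llama.cpp's server takes a similar schema parameter):

import requests
from pydantic import BaseModel


class Country(BaseModel):
    name: str
    capital: str


response = requests.post(
    'http://localhost:11434/api/chat',
    json={
        'model': 'llama3.1',
        'messages': [{'role': 'user', 'content': 'Tell me about Canada.'}],
        'format': Country.model_json_schema(),  # enforced by a grammar server-side
        'stream': False,
    },
)
print(Country.model_validate_json(response.json()['message']['content']))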

henryclw avatar Feb 22 '25 02:02 henryclw

In preparation for this work, I did a short survey of how structured outputs work across different models:

There are three ways to support structured outputs:

  • using tool calls - what PydanticAI does now; since all models support tool calling, this can be used with any model, although some behave much better than others in this mode
  • using provider structured outputs - fields defined in the provider's API: you pass a JSON schema and the model should return JSON matching that schema; only certain models support this
  • using "manual JSON schema mode" - you add a message to the system prompt saying "please return JSON matching this schema" and hope the model complies; you may need to strip markdown code fences to get to the JSON (a sketch follows after this list); all models support this since it's manual

Model support for structured outputs:

provider/model                 support
OpenAI gpt-4o, 4.5, o1         ✅ including schema, see here
OpenAI earlier e.g. 4-turbo    🌦️ partial support via JSON mode
Anthropic                      ❌
Gemini without tools           ✅
Gemini with tools              ❌ (see below)
Groq                           ❌

The only problematic provider is Google Gemini, which allows structured responses (see here) but doesn't allow tools to be registered if response_mime_type and/or response_schema are provided.

I would therefore propose that we use the best mode for structured outputs for each provider:

  • OpenAI would use response_format for 4o+ models, and tool calls for earlier models
  • Anthropic would use either tool calls or "manual JSON schema mode" - need to check which works best
  • Gemini would use response_schema if there are no tools registered, otherwise "manual JSON schema mode" I think (I've seen Gemini perform badly with tool calls)
  • Groq would use either tool calls or "manual JSON schema mode" - need to check which works best

Another consideration is whether we should use tools where the return type is a union, since registering multiple tools seems to work well.

Whichever mode we choose, we should allow overriding via:

  • a ToolMode wrapper type (sketched below)
  • a ManualJSONMode wrapper type
  • a method on models that lets you do whatever you like for structured outputs
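
One possible (entirely hypothetical) shape for those wrapper types:

from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar('T')


@dataclass
class ToolMode(Generic[T]):
    """Force structured output via a tool call, regardless of the provider default."""
    result_type: type[T]


@dataclass
class ManualJSONMode(Generic[T]):
    """Force "manual JSON schema mode": prompt for JSON and parse the raw text."""
    result_type: type[T]


# Hypothetical usage: Agent('gemini-1.5-pro', result_type=ManualJSONMode(UserProfile))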

Here's the code I used to investigate this:

OpenAI
from pydantic import BaseModel
from openai import OpenAI
from devtools import debug

client = OpenAI()


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]
    location_id: int


class EventLocation(BaseModel):
    """Get a location ID for an event by name"""

    name: str


def call_model(messages):
    return client.chat.completions.create(
        model='gpt-4o-2024-08-06',
        messages=messages,
        tools=[
            {
                'function': {
                    'name': EventLocation.__name__,
                    'description': EventLocation.__doc__,
                    'parameters': EventLocation.model_json_schema(),
                },
                'type': 'function',
            }
        ],
        response_format={
            'type': 'json_schema',
            'json_schema': {'name': CalendarEvent.__name__, 'schema': CalendarEvent.model_json_schema()},
        },
    )


messages = [
    {'role': 'system', 'content': 'Extract the event information.'},
    {'role': 'user', 'content': 'Alice and Bob are going to a science fair on Friday.'},
]
completion = call_model(messages)
debug(completion)
message = completion.choices[0].message
call = message.tool_calls[0]
answer = 123
messages += [
    message.model_dump(mode='json'),
    {
        'role': 'tool',
        'content': str(answer),
        'tool_call_id': call.id,
    }
]

completion = call_model(messages)
debug(completion)
Anthropic
import json

from pydantic import BaseModel
from devtools import debug
from anthropic import Anthropic

client = Anthropic()


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]
    location_id: int


class EventLocation(BaseModel):
    """Get a location ID for an event by name"""

    name: str


system_prompt = f"""
Extract the event information.

As a genius expert, your task is to understand the content and provide
the parsed objects in JSON that match the following json_schema:\n

{json.dumps(CalendarEvent.model_json_schema(), indent=2, ensure_ascii=False)}

Make sure to return an instance of the JSON, not the schema itself.
Respond with JSON only, no other text!
"""


def call_model(messages):
    return client.messages.create(
        max_tokens=1024,
        model="claude-3-5-sonnet-latest",
        messages=messages,
        system=system_prompt,
        tools=[
            {
                'name': EventLocation.__name__,
                'description': EventLocation.__doc__,
                'input_schema': EventLocation.model_json_schema(),
            }
        ],
    )


messages = [
    {'role': 'user', 'content': 'Alice and Bob are going to a science fair on Friday.'},
]
response = call_model(messages)
debug(response)
tool_use = next(b for b in response.content if b.type == 'tool_use')
answer = 123
messages += [
    {
        'role': 'assistant',
        'content': [tool_use.model_dump(mode='json')],
    },
    {
        'role': 'user',
        'content': [
            {
                'type': 'tool_result',
                'content': str(answer),
                'tool_use_id': tool_use.id,
            }
        ]
    }
]

response = call_model(messages)
debug(response)
Gemini
import base64
import json
import os

from devtools import debug
from google import genai
from google.genai import types

from pydantic import BaseModel


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]
    location_id: int


class EventLocation(BaseModel):
    """Get a location ID for an event by name"""

    name: str


def get_event_location_id(name: str) -> int:
    """Get a location ID for an event by name

    Args:
      name: The name of the event location
    """
    return 123


client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

system_prompt = f"""
Extract the event information.

As a genius expert, your task is to understand the content and provide
the parsed objects in JSON that match the following json_schema:\n

{json.dumps(CalendarEvent.model_json_schema(), indent=2, ensure_ascii=False)}

Make sure to return an instance of the JSON, not the schema itself.
Respond with JSON only, no other text, no markdown code block!
"""
# system_prompt = 'Extract the event information.'

response = client.models.generate_content(
    model='gemini-1.5-pro-002',
    contents='Alice and Bob are going to a science fair on Friday.',
    config=types.GenerateContentConfig(
        # response_mime_type='application/json',
        # response_schema=CalendarEvent,
        system_instruction=system_prompt,
        tools=[get_event_location_id],
    )
)
debug(response)
Groq
import json

from pydantic import BaseModel
from devtools import debug
from groq import Groq

client = Groq()


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]
    location_id: int


class EventLocation(BaseModel):
    """Get a location ID for an event by name"""

    name: str


system_prompt = f"""
Extract the event information.

As a genius expert, your task is to understand the content and provide
the parsed objects in JSON that match the following json_schema:\n

{json.dumps(CalendarEvent.model_json_schema(), indent=2, ensure_ascii=False)}

Make sure to return an instance of the JSON, not the schema itself.
Respond with JSON only, no other text!
"""


def call_model(messages):
    return client.chat.completions.create(
        messages=messages,
        model='llama-3.3-70b-versatile',
        tools=[
            {
                'type': 'function',
                'function': {
                    'name': EventLocation.__name__,
                    'description': EventLocation.__doc__,
                    'parameters': EventLocation.model_json_schema(),
                },
            }
        ],
    )


messages = [
    {'role': 'system', 'content': system_prompt},
    {'role': 'user', 'content': 'Alice and Bob are going to a science fair on Friday.'},
]
response = call_model(messages)
debug(response)
message = response.choices[0].message
call = message.tool_calls[0]
answer = 123

messages += [
    {
        'role': 'assistant',
        'tool_calls': [
            {
                'type': 'function',
                'id': call.id,
                'function': {
                    'name': call.function.name,
                    'arguments': call.function.arguments,
                }
            }
        ]
    },
    {
        'role': 'tool',
        'content': str(answer),
        'tool_call_id': call.id,
    }
]

completion = call_model(messages)
debug(completion)

samuelcolvin avatar Mar 24 '25 17:03 samuelcolvin

This sounds like a great plan and would help me a lot.

vLLM, Ollama and Azure AI all allow structured outputs as well - I guess they are all OpenAI-compatible.

I think they are using Outlines or XGrammar in the background to do this. This approach constrains the LLM so it can only return valid JSON, by masking the logits of all invalid tokens (driving their probability to zero). It actually speeds up inference as well as making sure the output is valid.
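
A toy illustration of that masking idea; real implementations like Outlines compile the schema into a token-level automaton, and the allowed-token set here is just a placeholder:

import numpy as np


def mask_logits(logits: np.ndarray, allowed_token_ids: set[int]) -> np.ndarray:
    """Keep only the logits of tokens the grammar allows at this step."""
    masked = np.full_like(logits, -np.inf)  # -inf logit -> probability 0 after softmax
    for token_id in allowed_token_ids:
        masked[token_id] = logits[token_id]
    return masked


# At each decoding step, the grammar state decides which tokens survive.
logits = np.random.randn(8)
print(mask_logits(logits, allowed_token_ids={1, 3}))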

tomliptrot avatar Mar 24 '25 17:03 tomliptrot

@samuelcolvin not sure I follow what you're saying about Gemini's lacking structured output support "with tools" -- AFAIU tool calling and (forced-JSON-schema-based) structured output are fundamentally mutually exclusive. If the output is forced to conform to a JSON schema, there isn't any leeway for the output to contain tool calls.

OpenAI has an explanation of when to choose one or the other here. But there's no mention of being able to use both together (again, having trouble imagining what that would look like?)

gabrielgrant avatar Mar 24 '25 18:03 gabrielgrant

For Groq I'd recommend using function calls, here's a modified example that seems to work well:

from pydantic import BaseModel
from devtools import debug
from groq import Groq

client = Groq()


class CalendarEvent(BaseModel):
    """Create a calendar event"""

    name: str
    date: str
    participants: list[str]
    location_id: int


class EventLocation(BaseModel):
    """Extract a location and give it an ID"""

    name: str
    location_id: int


system_prompt = f"""
Extract the event information.

As a genius expert, your task is to understand the content of the user's message and extract the event information using function calls.
"""


def call_model(messages):
    tools = [
        {
            "type": "function",
            "function": {
                "name": EventLocation.__name__,
                "description": EventLocation.__doc__,
                "parameters": EventLocation.model_json_schema(),
            },
        },
        {
            "type": "function",
            "function": {
                "name": CalendarEvent.__name__,
                "description": CalendarEvent.__doc__,
                "parameters": CalendarEvent.model_json_schema(),
            },
        },
    ]
    debug(tools)
    return client.chat.completions.create(
        messages=messages,
        model="llama-3.3-70b-versatile",
        tools=tools,
    )


messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
]
response = call_model(messages)
debug(response)
structured_outputs.py:50 call_model
    tools: [
        {
            'type': 'function',
            'function': {
                'name': 'EventLocation',
                'description': 'Extract a location and give it an ID',
                'parameters': {
                    'description': 'Extract a location and give it an ID',
                    'properties': {
                        'name': {
                            'title': 'Name',
                            'type': 'string',
                        },
                        'location_id': {
                            'title': 'Location Id',
                            'type': 'integer',
                        },
                    },
                    'required': [
                        'name',
                        'location_id',
                    ],
                    'title': 'EventLocation',
                    'type': 'object',
                },
            },
        },
        {
            'type': 'function',
            'function': {
                'name': 'CalendarEvent',
                'description': 'Create a calendar event',
                'parameters': {
                    'description': 'Create a calendar event',
                    'properties': {
                        'name': {
                            'title': 'Name',
                            'type': 'string',
                        },
                        'date': {
                            'title': 'Date',
                            'type': 'string',
                        },
                        'participants': {
                            'items': {
                                'type': 'string',
                            },
                            'title': 'Participants',
                            'type': 'array',
                        },
                        'location_id': {
                            'title': 'Location Id',
                            'type': 'integer',
                        },
                    },
                    'required': [
                        'name',
                        'date',
                        'participants',
                        'location_id',
                    ],
                    'title': 'CalendarEvent',
                    'type': 'object',
                },
            },
        },
    ] (list) len=2
structured_outputs.py:63 <module>
    response: ChatCompletion(
        id='chatcmpl-3ebef181-a504-482e-a0e2-84f020796d52',
        choices=[
            Choice(
                finish_reason='tool_calls',
                index=0,
                logprobs=None,
                message=ChatCompletionMessage(
                    content=None,
                    role='assistant',
                    function_call=None,
                    tool_calls=[
                        ChatCompletionMessageToolCall(
                            id='call_8rz0',
                            function=Function(
                                arguments='{"location_id": 1, "name": "science fair"}',
                                name='EventLocation',
                            ),
                            type='function',
                        ),
                        ChatCompletionMessageToolCall(
                            id='call_ggxd',
                            function=Function(
                                arguments=(
                                    '{"date": "Friday", "location_id": 1, "name": "science fair", "participants": ["Al'
                                    'ice", "Bob"]}'
                                ),
                                name='CalendarEvent',
                            ),
                            type='function',
                        ),
                    ],
                ),
            ),
        ],
        created=1742845628,
        model='llama-3.3-70b-versatile',
        object='chat.completion',
        system_fingerprint='fp_90c1d253ff',
        usage=CompletionUsage(
            completion_tokens=60,
            prompt_tokens=402,
            total_tokens=462,
            completion_time=0.218181818,
            prompt_time=0.026596563,
            queue_time=0.10102590499999999,
            total_time=0.244778381,
        ),
        x_groq={
            'id': 'req_01jq4v3krse8n9pkmzqj9w4a3j',
        },
    ) (ChatCompletion)

ricklamers avatar Mar 24 '25 19:03 ricklamers

@samuelcolvin not sure I follow what you're saying about Gemini's lacking structured output support "with tools" -- AFAIU tool calling and (forced-JSON-schema-based) structured output are fundamentally mutually exclusive. If the output is forced to conform to a JSON schema, there isn't any leeway for the output to contain tool calls.

OpenAI has an explanation of when to choose one or the other here. But there's no mention of being able to use both together (again, having trouble imagining what that would look like?)

Have you read the code? Using tool calling with structured outputs works well with OpenAI, as demonstrated in my code.

samuelcolvin avatar Mar 24 '25 22:03 samuelcolvin

For Groq I'd recommend using function calls, here's a modified example that seems to work well:

Okay @ricklamers, we'll stick to tool calls with Groq; that minimises the change anyway.

samuelcolvin avatar Mar 24 '25 22:03 samuelcolvin

Excited for this feature - just posting in case someone else runs into my issue with excessive tool-calling errors in OpenAI gpt-4o. The root cause seems to be the tool_choice: required param that gets set when you set an agent result_type. The model gets way too aggressive about picking tools, so it ends up calling the tool over and over again even when there is a valid response.

I've worked around it by just using text responses - any other state management I handle in the tool calls themselves rather than relying on the model response. So I think if this feature means it's not necessary to set required tool choice, it will likely fix that issue too (and still give the benefit of structured responses).

zheller avatar Mar 28 '25 20:03 zheller

I'm excited about this but wonder if manual mode should allow switching between JSON and XML. Cline and RooCode use XML for tool calling with manual structured output.

For complex cases, "return JSON matching this schema" often hallucinates. I think if we could access registered tools via RunContext in dynamic system prompts it would allow for stronger in-context learning and multi-tool support. Here's a sample I use:

# Tools

## read_file
Description: Read the file.
Parameters:
- path: (type: string, required: True) The path to the file to read.
<read_file>
  <path>path/to/file</path>
</read_file>
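
A toy sketch of parsing such an XML-style tool call out of raw model text (tag names follow the sample above; real parsers also handle streaming and malformed output):

import re
import xml.etree.ElementTree as ET

response = """I'll read that file now.
<read_file>
  <path>path/to/file</path>
</read_file>"""

# Pull out the first recognised tool block, then read its parameters.
match = re.search(r'<read_file>.*?</read_file>', response, re.DOTALL)
if match:
    tool = ET.fromstring(match.group(0))
    params = {child.tag: (child.text or '').strip() for child in tool}
    print(tool.tag, params)  # read_file {'path': 'path/to/file'}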

Strider1990 avatar Mar 29 '25 00:03 Strider1990

@samuelcolvin Thank you for taking care of this, I am really looking forward to this feature being added to pydantic-ai!

Just a quick note: following this discussion, I created a PR to add OpenAI's strict mode, which enforces JSON schema adherence in tool calls.

OpenAI also offers strict mode for structured outputs. Do you think there is anything against adding strict mode for OpenAI structured outputs as an optional parameter? I can take care of it as soon as pydantic-ai supports structured outputs.
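
For context, OpenAI's strict mode is a flag on the response_format; a sketch (the schema must satisfy the strict-mode subset, e.g. additionalProperties: false):

from openai import OpenAI
from pydantic import BaseModel


class CalendarEvent(BaseModel):
    name: str
    date: str


client = OpenAI()
schema = CalendarEvent.model_json_schema()
schema['additionalProperties'] = False  # required by strict mode

completion = client.chat.completions.create(
    model='gpt-4o-2024-08-06',
    messages=[{'role': 'user', 'content': 'Team sync next Tuesday.'}],
    response_format={
        'type': 'json_schema',
        'json_schema': {'name': 'CalendarEvent', 'schema': schema, 'strict': True},
    },
)
print(CalendarEvent.model_validate_json(completion.choices[0].message.content))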

mscherrmann avatar Mar 31 '25 12:03 mscherrmann

@mscherrmann just a heads up, I don't have the benchmark handy, but when we tested this following the release, strict mode was significantly slower (over 3x slower). This may have improved since we tested, but something to be cautious of.

mike-luabase avatar Apr 03 '25 16:04 mike-luabase

@mike-luabase Thank you for bringing that up; this is a very valid remark. However, it is not surprising. OpenAI discussed this in their presentation of the feature (see here, until 29:05). Given a JSON schema, they have to precompute the token masks for all possible states, which is quite computationally intensive and therefore takes some time. However, for a given schema they only do it once, so subsequent calls are as fast as those without structured outputs. That said, this is why I opted to add strict mode as an optional parameter, with the default set to false.

mscherrmann avatar Apr 07 '25 14:04 mscherrmann

I think the way pydantic-ai manages result_type by using forced tooling is a clever idea. Unfortunately, as soon as the structure gets a bit complex (for example, in our case, a list of typed dicts), most models can't follow it; I'm honestly not sure why. In our local testing, only qwen2.5 managed to get the format right, usually on the 5th try, whereas Claude 3.7 gets it on the 1st try. Most other local models that support tool calling cannot deal with it. For example:

[screenshots of example failures omitted]

arty-hlr avatar Apr 08 '25 18:04 arty-hlr

Hi team, does this support Ollama as well? It seems Ollama supports structured outputs too:

https://ollama.com/blog/structured-outputs

fsw0422 avatar Apr 19 '25 23:04 fsw0422

Reading through #1304, I did not manage to understand how to force strict mode not only for tool calling but also for the validation logic of Pydantic models.

@mscherrmann

Abdullahaml1 avatar May 18 '25 15:05 Abdullahaml1

@Abdullahaml1 See https://github.com/pydantic/pydantic-ai/issues/1445#issuecomment-2889542381

mscherrmann avatar May 19 '25 04:05 mscherrmann

still alive?

fswair avatar Jun 16 '25 09:06 fswair

@fswair Definitely, see https://github.com/pydantic/pydantic-ai/pull/1628.

DouweM avatar Jun 16 '25 12:06 DouweM