pydantic-ai
How can one simply return the response of the tool, instead of routing the response to a final result handler?
I'm assuming a result_tool is always needed.
In this case I want the tool to be another agent, and I just want the response from that agent, without any other LLM call.
I don't really understand the question I'm afraid.
Result tools are not required - if the return type is str, no tool is used.
And there aren't any LLM calls after the result is returned.
@pedroallenrevez I've implemented one idea @samuelcolvin and I discussed for addressing this in #142. I think it works and makes it possible for tool calls to be the "result tool", and makes it possible to disable the default schema-based result tool.
But I don't love the approach. Repeating what I wrote at the end of the PR body:
However, I personally find this to be kind of an awkward set of APIs to implement what I feel is a fairly reasonable pattern of wanting to require the model to call a particular function as the way it ends its execution. I would propose we add a new decorator
@agent.result_tool that requires you to return something compatible with the return type of the agent and ends the run when it is called. And if you use the @agent.result_tool decorator anywhere, we disable the default (schema-based) result tool.
Do you have any reactions to either of these proposals? In principle we could support both but I suspect @samuelcolvin might not want two different ways to override the default behavior. While I think the @agent.result_tool decorator is a better experience most of the time, the downside is that without the ctx.end_run feature, there wouldn't be a way to have a tool exit the run early. I haven't been able to think of scenarios where that would be significantly more useful than just requiring separate functions for @agent.tool tools and @agent.result_tool tools, but I also wouldn't be surprised if good use cases for dynamically deciding whether a tool call should end the run do exist.
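For concreteness, a rough sketch of how the two options might look in user code; neither ctx.end_run nor @agent.result_tool exists today, and the names and signatures are only illustrative:

```python
from pydantic_ai import Agent, RunContext

agent = Agent('openai:gpt-4o')

# Option 1 (the #142 approach): any tool may end the run itself.
@agent.tool
async def fetch_summary(ctx: RunContext[None], topic: str) -> str:
    summary = f'Summary of {topic}'
    ctx.end_run(summary)  # hypothetical: marks this value as the final result
    return summary

# Option 2: a dedicated decorator; calling this tool always ends the run,
# and registering it disables the default schema-based result tool.
@agent.result_tool  # hypothetical decorator
async def final_answer(ctx: RunContext[None], answer: str) -> str:
    return answer
```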
@jlowin — @samuelcolvin mentioned you had some ideas/thoughts on this issue, feedback on either of the proposals above is very much welcome.
Thanks @dmontagu for the thoughtful proposals. We had a chance to discuss offline, so I will try to summarize here --
I think we can distill or reframe the core issue and solution:
The fundamental problem is that using a user-supplied tool in Pydantic AI currently requires two LLM calls: one to use the tool and another to generate the final response. We want to enable single-call workflows if the user knows that a single call satisfies their objective, while maintaining strong typing.
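To make that concrete, here's roughly what happens today with a plain tool (a minimal sketch; the model and tool are only examples):

```python
from pydantic_ai import Agent, RunContext

agent = Agent('openai:gpt-4o')

@agent.tool
async def get_weather(ctx: RunContext[None], city: str) -> str:
    return f'18C and sunny in {city}'

# Call 1: the model decides to call get_weather.
# Call 2: the tool's return value is sent back to the model, which then writes
# its own final text, so result.data is a paraphrase rather than the tool output.
result = agent.run_sync('What is the weather in Paris?')
print(result.data)
```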
Rather than embedding termination logic within tools via ctx.end_run(), I believe the cleaner approach is your proposed @agent.result_tool decorator (or result_tool_plain, or however you spell it). This lets us:
- Keep tools focused purely on business logic while moving termination decisions to the agent configuration level
- Maintain strong typing by using the tool's return type
- Support multiple potential terminal actions through decoration/configuration (because multiple tools with simple args >> one tool with complex args)
For users who need dynamic termination logic, they can achieve this by giving their agent both regular and result tools, letting the agent itself make the choice of which to call. This keeps the architecture clean while supporting all needed use cases. The ultimate consequence of all of this is that the user is supplying the terminal tool(s) instead of having PydanticAI auto-generate one.
I think this aligns well with Pydantic AI's focus on composable, strongly-typed LLM invocations!
Even though I only vaguely understand what ctx.end_run means at this point, I think the decorator approach feels a lot more ergonomic. I think there is space for managing the state of an agent inside the tools by defining an EndAgentRun, but it doesn't feel intuitive for now.
Just a small observation:
- Wouldn't it feel more natural for this specific case to have an argument on the decorator, such as @agent.tool(bypass=True) or some other taxonomy? (rough sketch below)
- Also, to the point of using a result_tool: I might have some logic that requires calling an LLM afterwards and other logic that doesn't, so having one result tool feels awkward here; I would naturally just default to my tools handling all the logic, with the result tool becoming an awkward bypass of information. Unless I'm not understanding the full scope.
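A rough sketch of what I mean; the bypass flag is purely hypothetical:

```python
from pydantic_ai import Agent, RunContext

agent = Agent('openai:gpt-4o')
sub_agent = Agent('openai:gpt-4o', system_prompt='Answer tersely.')

# Hypothetical: a flag on the existing decorator instead of a new decorator.
@agent.tool(bypass=True)
async def ask_sub_agent(ctx: RunContext[None], query: str) -> str:
    result = await sub_agent.run(query, usage=ctx.usage)
    return result.data  # would end the run and be returned to the caller as-is
```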
> Keep tools focused purely on business logic while moving termination decisions to the agent configuration level
I feel this is the key point for achieving natural intuitiveness when building agents, because there is also a world where things get more complicated and there can be (strongly typed) trajectories between tools. So termination of state, as well as routing to the next step, should be manageable by the user, though I feel this might not be an immediate need for pydantic-ai.
Thanks people :)
There is a question of how to perform validation when registering multiple result tools. Presumably a single validation function still works, but the pattern of using isinstance to match the result to its type may not; for example, I might have two result tools that both return a list of ints. My recommendation would be that your custom function is called and does its validation right then and there, and its output is used as the agent's final result:
```python
@result_tool
def f(x: list[int]) -> list[int]:
    ...  # validate and return

@result_tool
def g(x: list[int]) -> list[int]:
    ...  # validate and return
```
(maybe this is obvious and falls out of the design but just in case)
Any news on this? I just came across this video of yours, @samuelcolvin, https://www.youtube.com/watch?v=YRYxsb_VLhI, where you talked about it! Or are there any hacky workarounds that we could use now?
Second this. I have a tool configured to return a very specific, targeted result for a certain class of semantic requests, but as of now the LLM just keeps chewing on it until it eventually blows up through retries and dies.
Letting a tool simply return its response to the user directly, without any modifications, would solve a lot of agentic problems. It's a showstopper as things stand right now.
I agree with @pedroallenrevez that if a decorator approach is desired, it would probably be best as an argument to @agent.tool() rather than an entirely new method.
What about some special EndConversation exception type that the tool could raise if it doesn't need to return anything (or has no response to return) to the LLM? I guess this might not work if the tool returns information that needs to be provided to the user.
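A rough sketch of what I mean; EndConversation is purely hypothetical and not a pydantic-ai type:

```python
from pydantic_ai import Agent, RunContext

class EndConversation(Exception):
    """Hypothetical: raising this from a tool would end the run immediately,
    optionally carrying a final value to hand back to the caller."""

    def __init__(self, final_output: str | None = None) -> None:
        self.final_output = final_output

agent = Agent('openai:gpt-4o')

@agent.tool
async def send_notification(ctx: RunContext[None], message: str) -> str:
    ...  # deliver the notification as a side effect
    raise EndConversation()  # nothing useful to feed back to the model
```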
BTW I have also raised this issue https://github.com/pydantic/pydantic-ai/issues/675 which is somewhat relevant, for cases where the LLM returns a message text along with a final tool call
Hey
I'm seeking a method to control the sequence and manage dependencies between tool executions within Pydantic AI agents, as the current automatic execution model lacks the necessary flexibility.
For example, in a trip planner, obtaining the hotel's address is essential before booking a flight. By defining such dependencies, I can prevent the LLM from getting stuck in loops.
I'm looking forward to testing Pydantic Graph and plan to implement a custom "identify_tasks" node at the beginning of the graph to list all necessary tool calls based on the user query.
In summary, I'm looking for a feature akin to bind_tools that returns a list of tool calls with their arguments, enabling manual management of execution sequences and dependencies. If there's an alternative solution to address this need, please let me know.
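The kind of "plan first" step I have in mind would look roughly like this, using a planner agent with a structured result type (all the names below are illustrative):

```python
from pydantic import BaseModel
from pydantic_ai import Agent

class PlannedCall(BaseModel):
    tool_name: str
    arguments: dict[str, str]

class Plan(BaseModel):
    steps: list[PlannedCall]

# The planner's only job is to emit the tool calls (and their order);
# executing them, with dependency checks, stays under my control.
planner = Agent(
    'openai:gpt-4o',
    result_type=Plan,
    system_prompt='List the tool calls needed to satisfy the request, in order.',
)

plan = planner.run_sync('Book a hotel in Lisbon, then a flight arriving nearby')
for step in plan.data.steps:
    print(step.tool_name, step.arguments)
```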
Thx
Hello guys, anything on this one? That would be really helpful 🙏
Seconded. I've just been battling with the final agent not wanting to play ball when the tool returns exactly what the user needs for that response.
+1
+1 Keen on this! Especially when running a multi-agent system.
In my case, once the manager agent has received all the data from the other agents, I'd like it to respond to the user via an answering tool that has its own system prompt. I don't want it to answer by itself, nor, as it currently does, to re-create a response after it gets the final response from the answering tool.
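Concretely, the pattern I'm running looks roughly like this (names and models are only examples); the pain point is that the manager still gets one more LLM turn to rewrite the answering agent's output:

```python
from pydantic_ai import Agent, RunContext

manager_agent = Agent('openai:gpt-4o', system_prompt='Gather data, then answer the user.')
answer_agent = Agent('openai:gpt-4o', system_prompt='Write the final reply in the house style.')

@manager_agent.tool
async def answer_user(ctx: RunContext[None], collected_data: str) -> str:
    result = await answer_agent.run(collected_data, usage=ctx.usage)
    return result.data  # today this goes back to the manager for one more LLM call
```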
+1
I'm trying to implement a system where the main agent can delegate to "sub-agents", which have strictly defined output formats depending on various other factors. Currently, the main agent takes the sub-agent response and mangles it all up.
+1 on this.
This is pretty fundamental in my opinion. Let's say you want your agent to act on a Model and you want to give it controllers to affect that Model. Right now the "cleanest" way is to set the controllers as result types and, at the end of the run, merge those into the Model. You should just be able to provide the controllers to the LLM and end the run when one of them is used.
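A minimal sketch of that current workaround, assuming a simple Document model and two controller commands (all names are illustrative):

```python
from typing import Union

from pydantic import BaseModel
from pydantic_ai import Agent

class Document(BaseModel):
    title: str = ''
    body: str = ''

class SetTitle(BaseModel):
    title: str

class AppendBody(BaseModel):
    text: str

# Today's route: expose the controllers as result types, then merge the
# chosen command into the Model after the run ends.
agent = Agent('openai:gpt-4o', result_type=Union[SetTitle, AppendBody])

def apply(doc: Document, command: Union[SetTitle, AppendBody]) -> Document:
    if isinstance(command, SetTitle):
        return doc.model_copy(update={'title': command.title})
    return doc.model_copy(update={'body': doc.body + command.text})

result = agent.run_sync('Give the document the title "Quarterly report"')
doc = apply(Document(), result.data)
```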
+1
> I'm trying to implement a system where the main agent can delegate to "sub-agents", which have strictly defined output formats depending on various other factors. Currently, the main agent takes the sub-agent response and mangles it all up.
FWIW, and for anyone with a similar use case to mine: I just have a 'delegator' agent that has access to a list of 'sub' agent names and picks the right one, and I use that to query a dictionary mapping agent names to agents, call the chosen agent, and get its result directly. This is slow, and the first query feels a bit like overkill, but it works for now.
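Roughly, it looks like this (the agents and model names are just examples):

```python
from pydantic_ai import Agent

billing_agent = Agent('openai:gpt-4o', system_prompt='You handle billing questions.')
support_agent = Agent('openai:gpt-4o', system_prompt='You handle support questions.')

sub_agents: dict[str, Agent] = {'billing': billing_agent, 'support': support_agent}

# The delegator's only job is to name the right sub-agent for the query.
delegator = Agent(
    'openai:gpt-4o',
    system_prompt=f'Reply with exactly one of: {", ".join(sub_agents)}',
)

async def answer(query: str) -> str:
    choice = await delegator.run(query)
    agent = sub_agents.get(choice.data.strip(), support_agent)
    result = await agent.run(query)
    return result.data  # the sub-agent's output, never rewritten by the delegator
```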
I found a funny workaround for this problem; see that workaround here.