
support batch processing

darjeeling opened this issue 6 months ago

Description

To reduce costs, I think it would be good to support batch processing.

Some models support it. How high a priority is this for you?

Anthropic batch processing, OpenAI Batch API

References

No response

darjeeling avatar May 19 '25 20:05 darjeeling

From @bjorkbjork on Slack:

We've built a dynamic factory that compiles Agents, Tools, and MCP servers from database records into the PydanticAI classes at runtime, and iterates the agent. Right now our main suite of agents that we'll be providing to the customer is a series of root-cause-analysis agents: they ingest a lot of data from a ClickHouse MCP server and use it to inform their analysis and search for leads. There is a very big data source to look at, and we're getting rate-limited by Anthropic. We reached out to them about raising the TPM limits, and they're trying to push us onto batch processing.

If it were supported, what we would likely do is replace the ch_execute_query tool we have with an agent that has access to query the database (the Anthropic docs claim batches support tool usage). We'd send the queries as batched runs that query the data, then analyze and provide back insights (working through the huge token data sources as batched runs rather than synchronously), then wait for completion. We'd probably run a lot of those batches in parallel, as the main agent jumps around the DB looking for clues that can inform new queries. If it's on the backburner I'll just continue without batch processing for now.
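For context, a minimal sketch of what one of those batched query runs could look like against the Anthropic Message Batches API directly, outside pydantic-ai. The `run_batched_queries` helper, the model name, and the polling interval are all assumptions for illustration:

```python
# Hypothetical helper, not part of pydantic-ai: submit a set of analysis
# prompts as one Anthropic message batch and collect the results.
import time

import anthropic
from anthropic.types.message_create_params import MessageCreateParamsNonStreaming
from anthropic.types.messages.batch_create_params import Request

client = anthropic.Anthropic()


def run_batched_queries(prompts: dict[str, str]) -> dict[str, str]:
    """Map custom_id -> prompt in, custom_id -> response text out."""
    batch = client.messages.batches.create(
        requests=[
            Request(
                custom_id=custom_id,
                params=MessageCreateParamsNonStreaming(
                    model='claude-3-5-sonnet-latest',  # assumption: any batch-capable model
                    max_tokens=1024,
                    messages=[{'role': 'user', 'content': prompt}],
                ),
            )
            for custom_id, prompt in prompts.items()
        ]
    )
    # Batches complete asynchronously, so poll until processing has ended.
    while batch.processing_status != 'ended':
        time.sleep(30)
        batch = client.messages.batches.retrieve(batch.id)
    results: dict[str, str] = {}
    for entry in client.messages.batches.results(batch.id):
        if entry.result.type == 'succeeded':
            results[entry.custom_id] = entry.result.message.content[0].text
    return results
```

Note the trade-off this implies: per Anthropic's docs a batch can take up to 24 hours to complete, which is why the workflow above fans many batches out in parallel and waits.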

I suspect that with a few hours of engineering and some async foo, you could do this now.

Ideally you'd make that implementation open source, and it would become the foundation for the built-in implementation.

Here's my idea:

  1. The idea relies on the fact that each batch request's params matches the standard Messages API format (docs).
  2. You create a new BatchModel implementation of Model: perhaps it inherits from WrapperModel and wraps the AnthropicModel, or perhaps it uses some methods from the AnthropicModel without wrapping it.
  3. The model has some (fairly complex) logic to create batches of messages and fire a request when enough messages come in, then wait for the response and yield the right responses to the right waiting tasks. This is the hard bit; it might work best with anyio channels (see the sketch after this list).
  4. Agents use the BatchModel instance as their model and operate as normal. There will be a lot of hanging tasks, but that's fine; this might work best with #1975 long term.
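Here is a minimal standalone sketch of the collector in step 3, using plain asyncio futures rather than anyio channels. BatchCollector, PendingRequest, send_batch, max_size, and max_wait are all hypothetical names; the real thing would sit behind pydantic-ai's Model interface:

```python
# Hypothetical sketch: gather individual model requests into batches, fire
# one batch call per flush, and route each result back to its waiting task.
import asyncio
from dataclasses import dataclass, field
from typing import Any


@dataclass
class PendingRequest:
    params: dict[str, Any]  # standard Messages API params, per step 1
    future: asyncio.Future = field(default_factory=asyncio.Future)


class BatchCollector:
    def __init__(self, send_batch, max_size: int = 50, max_wait: float = 2.0):
        # send_batch: async callable taking a list of params and returning
        # the matching list of results, e.g. a wrapper around the batch
        # create/poll/results calls sketched earlier in this thread.
        self._send_batch = send_batch
        self._max_size = max_size
        self._max_wait = max_wait
        self._pending: list[PendingRequest] = []
        self._lock = asyncio.Lock()
        self._timer: asyncio.Task | None = None

    async def request(self, params: dict[str, Any]) -> Any:
        """Called by each agent task; hangs until its batch completes (step 4)."""
        req = PendingRequest(params)
        async with self._lock:
            self._pending.append(req)
            if len(self._pending) >= self._max_size:
                if self._timer is not None:
                    self._timer.cancel()
                    self._timer = None
                self._flush()
            elif self._timer is None:
                # First request of a new batch: flush after max_wait seconds
                # even if the batch never fills up.
                self._timer = asyncio.create_task(self._flush_later())
        return await req.future

    async def _flush_later(self) -> None:
        await asyncio.sleep(self._max_wait)
        async with self._lock:
            self._timer = None
            self._flush()

    def _flush(self) -> None:
        # Must be called while holding the lock.
        batch, self._pending = self._pending, []
        if batch:
            asyncio.create_task(self._run_batch(batch))

    async def _run_batch(self, batch: list[PendingRequest]) -> None:
        try:
            results = await self._send_batch([r.params for r in batch])
            for req, result in zip(batch, results):
                req.future.set_result(result)
        except Exception as exc:
            for req in batch:
                if not req.future.done():
                    req.future.set_exception(exc)
```

Swapping the futures for anyio memory channels, as suggested above, wouldn't change the shape of the coordination: collect, flush on size or timeout, fan results back out to the waiting tasks.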

samuelcolvin avatar Jul 03 '25 08:07 samuelcolvin

If it helps: I decided to abstract this extra work into a library with Pydantic response support for batch requests, https://github.com/agamm/batchata (MIT, OpenAI and Anthropic support). It's solely focused on being a batch request library in Python.

agamm avatar Jul 17 '25 18:07 agamm