Inconsistent and very slow performance with non-Anthropic models
Is this designed to work with Anthropic models only? I'm asking because I've tried OpenAI GPT-5, GPT-5-mini, GPT-5-nano and Qwen3 (locally via LM Studio), and performance is very slow and the output very inconsistent. I'd say the agent is unable to plan and/or follow through on a plan. Is prompt caching (an Anthropic-only feature) critical here as well?
Could you share some examples (use cases) or traces (from LangSmith, if you have any) of poor performance with OpenAI models?
Prompt caching is nice to have for Anthropic (it roughly halves costs, from what I've seen) but is not essential and shouldn't impact quality.
I'll share traces if I get them... But if you want to reproduce, just do this:

```python
agent = create_deep_agent(
    tools=[internet_search],
    model="openai:gpt-5",
    instructions=research_instructions,
    subagents=[critique_sub_agent, research_sub_agent],
).with_config({"recursion_limit": 1000})
```
and ask "Compare the performance of Sinner and Alcaraz". With the default (Anthropic) model it works as expected. With the code above, it either fails to plan the TODO list or fails to complete. I also tried Sonnet 4.5 and hit a token rate limit, so it could not complete:
```
2025-10-01T06:57:10.043965Z [error ] Background run failed. Exception: <class 'anthropic.RateLimitError'>(Error code: 429 - {'type': 'error', 'error': {'type': 'rate_limit_error', 'message': 'This request would exceed the rate limit for your organization (be0f6169-4909-4afb-be7c-ae88f49e9153) of 30,000 input tokens per minute. For details, refer to: https://docs.claude.com/en/api/rate-limits. You can see the response headers for current usage. Please reduce the prompt length or the maximum tokens requested, or try again later. You may also contact sales at https://www.anthropic.com/contact-sales to discuss your options for a rate limit increase.'}, 'request_id': 'req_011CTg3SxciJzXzN7sb1SCcp'})
```
Hey @marcofiocco thanks for sharing the snippet and apologies for the delay in response.
The research example in particular can definitely be updated to converge faster; it's intended as a sample starter prompt for a researcher and showcases how you can use tools and subagents. A prompt for a deep agent should generally be much more detailed. One thing I've found particularly helpful is offering examples and heuristics on "how hard to try" for different tasks.
I think a lot of the frustration that you're running into can be solved by a more custom prompt for the example. Depending on your model and your rate limits (like you flagged), you might need to customize the system prompt to do fewer things in parallel. Let me know if updating the prompt works for you, and feel free to open a PR with an example prompt that you find works better!
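For what it's worth, the kind of prompt tightening described above can be sketched like this (the added guidance text is my own wording, and `research_instructions` is a placeholder standing in for the example's real prompt):

```python
# Hypothetical tightening of the example prompt for non-Anthropic models.
# `research_instructions` is a placeholder for the real example prompt.
research_instructions = "You are an expert researcher. ..."

focused_instructions = research_instructions + """

EXECUTION DISCIPLINE (for models that struggle to plan):
- Write the todo list FIRST, then work through it ONE item at a time.
- Call at most one subagent at a time; wait for its result before moving on.
- Stop researching a question after 3 relevant sources and summarize.
"""
```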
@marcofiocco Your assessment that Deep Agent = Anthropic only matches what I found in my work as well. I wasn't able to get subagents invoked from the orchestrator agent using GPT-5 either. All the feedback I got from the devs was that my prompt was incorrect, but I don't think that's the real issue...
Hey @DerekKane mind sharing your prompts and also how your subagents are defined? I can take a look!
@nhuang-lc - Thanks for taking a look at this one. I have an MCP Server which is a MS Cosmos query tool that runs a SELECT statement to grab the latest record with a Client ID, for context.
This setup works perfectly on the default Anthropic models in the DeepAgent v1 framework: it spins up the subagents and uses tools appropriately. If the only change I make is a shift to the Azure OpenAI models, the orchestrator never initiates the subagents, and the todo-list creation is affected as well.
If you can get GPT-5 or GPT-5-Chat from Azure OpenAI working... that would be really great, because I haven't solved it. Here is the code:
Contract Agent: Deep Agent with MCP Tools Capability

```python
# Load the libraries
import os
import sys
from typing import Literal, Dict
from pathlib import Path
import json
import logging
from datetime import datetime
from dotenv import load_dotenv

# MCP servers
import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

# Helper function to deal with sync/async operations
from wrap_mcp_tools_sync import make_sync_tools

# Load the master deep agent
from deepagents import create_deep_agent, SubAgent

# Pull in the correct .env file for keys
CURRENT_DIR = Path(__file__).resolve().parent
load_dotenv(CURRENT_DIR / ".env")

# Run a check on the async/sync method
def _get_loop():
    try:
        return asyncio.get_event_loop()
    except RuntimeError:
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        return loop

# Activate and use the MCP servers for agentic tools
# IMPORTANT: use an absolute path to your server on Windows
MS_COSMOS_SERVER_PATH = r"C:\Users\derek\Documents\Sample\AI_Agent\deepagents\examples\mcp_agent\mcp_server\ms_cosmos_mcp_server.py"

# Pass env through
_server_env = dict(os.environ)

# (optional) Ensure key exists; otherwise the server will still start,
_mcp_client = MultiServerMCPClient(
    {
        "MS-Cosmos": {
            "command": sys.executable,  # venv python
            "args": ["-u", MS_COSMOS_SERVER_PATH, "--stdio"],  # make stdio explicit
            "transport": "stdio",
            "env": _server_env,
            "cwd": str(Path(MS_COSMOS_SERVER_PATH).parent),  # stable working dir
        }
    }
)

# Load tools without creating ad-hoc loops during import
def _load_mcp_tools_sync():
    try:
        return asyncio.run(_mcp_client.get_tools())
    except RuntimeError as e:
        # if you ever hit "asyncio.run() cannot be called..." under a running loop
        import anyio
        if "cannot be called from a running event loop" in str(e):
            return anyio.from_thread.run(_mcp_client.get_tools)
        raise

_mcp_tools = _load_mcp_tools_sync()
tools = make_sync_tools(_mcp_tools)
```
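For readers without access to `wrap_mcp_tools_sync` (it's a local helper module, not a published package): here is a hedged guess at its general shape, i.e. running each async tool to completion on a fresh event loop. This is my sketch, not the poster's actual code.

```python
# Hypothetical sketch of a sync wrapper like make_sync_tools might use;
# wrap_mcp_tools_sync is the poster's local module, so this is a guess
# at its shape, not its actual implementation.
import asyncio
from typing import Any, Callable, Coroutine

def make_sync(async_fn: Callable[..., Coroutine[Any, Any, Any]]) -> Callable[..., Any]:
    """Return a blocking wrapper that runs an async callable on a new loop."""
    def sync_fn(*args: Any, **kwargs: Any) -> Any:
        return asyncio.run(async_fn(*args, **kwargs))
    return sync_fn
```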
Define the SubAgents that are available to the master agent

```python
# Sub-agent configurations
compensation_analyst = {
    "name": "compensation_analyst",
    "description": "Finds and normalizes all brokerage-compensation terms; proposes one governing term.",
    "prompt": """You are the Compensation Analyst. ALWAYS OUTPUT something, even if evidence is missing (use NEEDS_EVIDENCE rows).
Inputs:
- packet_index (list of {doc_id, type, page_count})
- text_blocks (normalized text with positions)
- metadata (parties/property/dates if available)
Tasks:
- Extract every buyer-broker compensation term (%, flat $, incentives, MLS co-broke, seller-paid vs buyer-paid).
- Note conflicts across offers, counters, agency agreements, MLS, addenda; flag cumulative/dual-pay risk.
- Recommend ONE clean governing compensation term that avoids double-pay and lender IPC issues.
- Provide CDA-ready mapping basics (side, basis, line text) at current price if present.
Rules:
- Cite each claim with {doc_id, page, span/excerpt}.
- If a document is missing or image-only, add a NEEDS_EVIDENCE row naming the doc and why.
Output (concise):
- compensation_matrix.csv (source, term, payor, read_as, risk, citation)
- governing_compensation.md (2–5 bullet points, with citations)
""",
}
```
```python
timeline_contingency_analyst = {
    "name": "timeline_contingency_analyst",
    "description": "Builds the contingency calendar and flags timing defects with precise dates.",
    "prompt": """You are the Timeline & Contingency Analyst. ALWAYS OUTPUT something, even if evidence is missing (use NEEDS_EVIDENCE rows).
Inputs:
- packet_index (list of {doc_id, type, page_count})
- text_blocks (normalized text with positions)
- metadata (acceptance/offer dates if available)
Tasks:
- Extract deadlines (financing, inspection, testing, survey, title cure, warranty, other).
- Convert any relative windows to absolute dates (YYYY-MM-DD). Show your math from the trigger date.
- Compare required windows vs any known actual actions; mark on-time/late/unknown.
- Propose plain-English extensions/ratifications with exact dates if needed (no legal advice).
Rules:
- Cite each date/window with {doc_id, page, span/excerpt}.
- If acceptance date or trigger is missing, add a NEEDS_EVIDENCE row.
Output (concise):
- contingency_calendar.csv (obligation, trigger, window, due_date, actual_date, status, citation)
- timeline_summary.md (2–5 bullet points, with citations)
""",
}
```

(Note: the `name` originally mixed separators, `"timeline_contingency-analyst"`, which did not match the name the orchestrator is told to call; it is normalized to underscores here.)
```python
critique_editor = {
    "name": "critique_editor",
    "description": "Audits the draft brief for clarity, consistency, and evidence coverage; proposes concrete fixes.",
    "prompt": """You are the Critique & QA Editor. ALWAYS OUTPUT something, even if inputs are incomplete.
Inputs (provide what you have):
- final_report_draft (text of the draft broker brief)
- compensation_matrix.csv (if available)
- governing_compensation.md (if available)
- contingency_calendar.csv (if available)
- timeline_summary.md (if available)
Checks (be strict, but plain-English):
- Citations: Every material claim includes {doc_id, page, span/excerpt}. Flag any missing.
- Dates: All deadlines are absolute (YYYY-MM-DD) with math shown or referenced. Flag relative phrasing.
- Consistency: The governing compensation term is single, unambiguous, and not contradicted elsewhere.
- Lender Sensitivity: Note risks of double-pay/IPC framing inconsistencies.
- Timeline Math: Verify trigger → window → due_date math; flag unclear triggers.
- Clarity & Actionability: Short bullets; operational tone; concrete next-step owners/deadlines.
Output (concise, actionable):
- issues.md: Numbered list of findings with fields: {severity: [H|M|L], section, problem, location_reference, exact_fix}
- revised_final_report.md: If fixes are purely editorial (typos/format/citation placement/absolute dates derivable from provided math), apply them and output the corrected report.
- revision_requests.md: If fixes require new evidence or re-analysis, list specific requests (doc_id or data needed) and where they will be used.
""",
}
```
```python
# Sub-agents
subagents = [compensation_analyst, timeline_contingency_analyst, critique_editor]
```
Define the Master Agent Instructions
```python
# Main research instructions
orchestrator_instructions = """You are the ORCHESTRATOR for a residential real-estate packet. Fetch the latest packet by Parent_ID, ALWAYS run the two analysis sub-agents, synthesize a draft, then ALWAYS run the critique_editor before finalizing.

ALWAYS DO FIRST
- The first thing you should do is write the original user question to `question.txt` so you have a record of it.
- Use the `write_todos` tool to write a brief description of what you plan to do to `todo.txt`.
- If Parent_ID (e.g., Client_03) is not provided, ask for it and STOP.
- Call the `latest_contract_by_parent` tool with the Parent_ID. Treat the result as the source of truth for context.
- If the result is empty/ambiguous, ask for a correct Parent_ID and STOP.

ALWAYS RUN THESE SUB-AGENTS (no exceptions)
- compensation_analyst
- timeline_contingency_analyst
Provide each only the smallest sufficient context: packet_index, relevant text_blocks, and minimal metadata.

SYNTHESIZE DRAFT (brief and operational)
Write final_report_draft.md with:
- Snapshot (parties, property, acceptance/offer date(s) with citations)
- Compensation (3–6 bullets + ONE governing term with citations)
- Timeline & Contingencies (3–6 bullets + any required extensions with absolute dates and citations)
- Next Steps (24–48h) with owners

CRITIQUE PASS (MANDATORY)
- Provide the draft and available artifacts to critique_editor.
- Save outputs:
  - issues.md
  - revised_final_report.md (if edits were auto-applied)
  - revision_requests.md (if new evidence or re-analysis is needed)

FINALIZE
- If revised_final_report.md exists, copy it to final_report.md.
- Otherwise, keep final_report_draft.md as final_report.md.
- If issues with severity H or M remain unresolved, append a short "Open Items" section to final_report.md summarizing blocking items and required docs.

GLOBAL RULES
- Every claim includes a citation {doc_id, page, span/excerpt}.
- Convert all relative time windows to absolute dates (YYYY-MM-DD) and show/trace the math.
- If info is missing or a PDF is image-only, sub-agents still output with NEEDS_EVIDENCE rows.

ARTIFACTS
- compensation_matrix.csv
- governing_compensation.md
- contingency_calendar.csv
- timeline_summary.md
- final_report_draft.md
- issues.md
- revised_final_report.md (optional)
- revision_requests.md (optional)
- final_report.md

DELEGATION PROTOCOL (STRICT)
- To run a sub-agent, CALL IT BY NAME (exactly): compensation_analyst, timeline_contingency_analyst, critique_editor.
- Always run in this order for each analysis:
  1) compensation_analyst
  2) timeline_contingency_analyst
  3) critique_editor
- Do not attempt to replicate their work yourself; delegate.
"""
```
Define a custom model - AzureChatOpenAI
```python
import os
from langchain_openai import AzureChatOpenAI

env_vars = {
    "AZURE_OPENAI_API_KEY": "XXXXXXXX",
    "AZURE_OPENAI_ENDPOINT": "XXXXXXXX",
    "AZURE_OPENAI_DEPLOYMENT_NAME": "gpt-5-chat",
    "AZURE_OPENAI_API_VERSION": "2025-01-01-preview",
}
os.environ.update(env_vars)
```
The original snippet was cut off mid-docstring; the function body below is a plausible reconstruction (AzureChatOpenAI reads the key and endpoint from the environment):

```python
def get_default_model():
    """
    Returns an Azure OpenAI chat model via LangChain.
    Requires these env vars:
      - AZURE_OPENAI_API_KEY
      - AZURE_OPENAI_ENDPOINT (e.g., https://<your-resource>.openai.azure.com/)
      - AZURE_OPENAI_DEPLOYMENT_NAME
      - AZURE_OPENAI_API_VERSION
    """
    # (body reconstructed; the original comment post truncated here)
    return AzureChatOpenAI(
        azure_deployment=os.environ["AZURE_OPENAI_DEPLOYMENT_NAME"],
        api_version=os.environ["AZURE_OPENAI_API_VERSION"],
    )

model = get_default_model()
```
Create the DeepAgent
```python
agent = create_deep_agent(
    tools=tools,
    instructions=orchestrator_instructions,
    model=model,
    subagents=subagents,
).with_config({"recursion_limit": 1000})
```
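When debugging "the orchestrator never initiates the subagent", it helps to check the run's messages for delegation tool calls rather than eyeballing the transcript. A small helper sketch (the subagent delegation tool is named `task` in deepagents, as far as I can tell):

```python
# Hedged sketch: scan a run's messages for tool calls, so you can see
# whether the orchestrator ever invoked the `task` (subagent) tool.
from typing import Any, List

def tool_calls_made(messages: List[Any]) -> List[str]:
    """Collect the names of all tool calls across a run's messages."""
    return [
        tc["name"]
        for msg in messages
        for tc in (getattr(msg, "tool_calls", None) or [])
    ]

# Usage, after result = agent.invoke({...}):
#   print(tool_calls_made(result["messages"]))
```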
How do I use an open-source model with the deepagents code? Is there an example anywhere, or can anyone share a code snippet for using Qwen3 or Qwen3-Instruct models?