Claude Sonnet 4 is very eager to create sub-agents, and goose does not handle it well
I ran the "create a tamagotchi game" test, and:
- the main agent is instructed to build a game
- the main agent immediately creates a sub-agent to handle the whole thing
- the sub agent tries to also create a sub-agent
- the recursive sub-agent call fails (as it should)
- the sub-agent then replies with some code directly (no developer tools are called) and reports success to the main agent
- the main agent summarizes this reply
So in the end, goose reported that everything worked, yet no code was written.
goose version: 1.15.0 (pre release e9b9dcc3634bcc1a12504e33a29baaea631f6ddc)
/cc @DOsinga @alexhancock. I know the other week we started seeing recursive sub-agents causing issues and did #5659 -- this must have been around the time when model behavior changed?
Yeah I've seen eagerness to create subagents across multiple models/providers. We no longer have subagent based instructions in the system prompt.
Is it perhaps the tool description for create_dynamic_task_tool? https://github.com/block/goose/blob/899100a5e4b6a3ea3f3df6e87f2e0931ddc94ae8/crates/goose/src/agents/recipe_tools/dynamic_task_tools.rs#L117
I have not seen this for a while. I wonder if it is triggered by specific, detailed prompts (any time I have seen it, it didn't seem overeager).
This might have been partially my fault: I looked back at the session logs and it turns out I had no extensions enabled. So without a developer tool, the task I gave goose was not really possible to complete. That said, the model decided to try to do it with a sub-agent, which does make sense -- the model may "think" that the sub-agent will have that capability.
I think for one we should disable sub-agents entirely if you have no extensions enabled.
should be an easy fix
created #5825 for this
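For illustration, the "no extensions, no sub-agents" idea could look roughly like the sketch below. This is not what #5825 or the goose codebase actually does: `SUBAGENT_TOOL`, `advertised_tools`, and the tool names are hypothetical placeholders, and the only assumption is that the tool list shown to the model is assembled in one place.

```rust
// Hypothetical sketch: none of these names come from the goose codebase.

/// Placeholder name for the tool that spawns a sub-agent.
const SUBAGENT_TOOL: &str = "create_subagent_task";

/// Returns the tool names to advertise to the model.
/// The sub-agent tool is only offered when at least one extension is enabled,
/// since a sub-agent would otherwise have nothing to act with.
fn advertised_tools(enabled_extensions: &[&str], extension_tools: &[&str]) -> Vec<String> {
    let mut tools: Vec<String> = extension_tools.iter().map(|t| t.to_string()).collect();
    if !enabled_extensions.is_empty() {
        tools.push(SUBAGENT_TOOL.to_string());
    }
    tools
}
```

With an empty extension list the model never sees the sub-agent tool, so it cannot delegate a task that nobody is able to carry out.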
https://github.com/block/goose/pull/5659 and https://github.com/block/goose/pull/5441 should resolve this
subagents with no extensions may be a valid use case, e.g. the main agent can debate or brainstorm with a subagent
it may be a valid use case, but we'd have to change the prompts all over. right now subagents without tools realize they don't have tools and then fall back to telling the user what to do (which is reasonable behavior for a main agent). the main agent then assumes the subagent actually did those things and reports back to the user that everything is done.
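One way to make that failure visible, sketched under heavy assumptions: if the reply handed back to the main agent carried a count of the tool calls the sub-agent actually executed, a reply with zero calls could be labeled as advice and not as finished work. The types and functions below (`SubagentReply`, `ReplyKind`, `classify`, `annotate_for_main_agent`) are made up for this example and are not goose APIs.

```rust
// Hypothetical sketch: these types are illustrative, not goose's actual API.

/// What a sub-agent hands back to the main agent.
struct SubagentReply {
    /// Free-form text from the sub-agent (may contain code or instructions).
    text: String,
    /// Number of tool calls the sub-agent actually executed.
    tool_calls: usize,
}

/// How the main agent should treat the reply.
#[derive(Debug, PartialEq)]
enum ReplyKind {
    /// The sub-agent ran tools, so its text may describe real work.
    WorkPerformed,
    /// No tools ran: the text is advice at best, nothing was changed.
    AdviceOnly,
}

/// Classifies a reply purely by whether any tool was executed.
fn classify(reply: &SubagentReply) -> ReplyKind {
    if reply.tool_calls == 0 {
        ReplyKind::AdviceOnly
    } else {
        ReplyKind::WorkPerformed
    }
}

/// Prefixes advice-only replies so suggestions are not mistaken for completed work.
fn annotate_for_main_agent(reply: &SubagentReply) -> String {
    match classify(reply) {
        ReplyKind::WorkPerformed => reply.text.clone(),
        ReplyKind::AdviceOnly => format!(
            "[sub-agent executed no tools; nothing was written] {}",
            reply.text
        ),
    }
}
```

A tool-call count is a crude signal (a sub-agent can call tools and still fail), but it would catch the case in this issue, where code was returned as text and never written to disk. It would also leave room for tool-less brainstorming sub-agents, since their replies would simply be marked as advice.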