
Cross-provider subagents

Open FYZAFH opened this issue 4 months ago • 7 comments

This is just a discussion, not a feature request. I believe GPT or Gemini have stronger mathematical and logical capabilities and longer context (their training emphases differ slightly from Claude's), and outsourcing that kind of work could save the main agent's context. Additionally, Claude is indeed somewhat expensive. Would it make sense to package some relatively independent tasks and hand them over to Codex? My concern is whether this would disrupt the existing workflow (especially debugging and fixing, which is more tightly coupled to superpowers than code review is)?

FYZAFH avatar Dec 01 '25 17:12 FYZAFH

I have some interest in cross-provider agent swarms, but I think we're still a long way from baking that into superpowers. Also, I'd be fairly uncomfortable with building functionality that required folks to be paying multiple providers. (And just for context length, sonnet[1m] is pretty nice.)

I'm not going to close this out, but yeah. this isn't a near term thing.

obra avatar Dec 01 '25 22:12 obra

Would outsourcing Phases 1–3 of systematic debugging to a dedicated subagent (or specialized bug-analysis tool) produce better results? Such a tool/subagent would encapsulate the prompts for Phases 1–3, take a bug description as input, and directly return the root cause (or a highly focused hypothesis about it).
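To make the proposed contract concrete, here is a minimal sketch of what such a hand-off could look like. Everything here is hypothetical: `BugReport`, `RootCauseHypothesis`, and `analyze_bug` are invented names standing in for whatever dispatch mechanism the harness actually provides.

```python
from dataclasses import dataclass, field

@dataclass
class BugReport:
    """Input contract: everything the subagent gets to see."""
    description: str                           # symptom, expected vs. actual behavior
    repro_steps: list[str] = field(default_factory=list)    # minimal reproduction, if known
    suspect_files: list[str] = field(default_factory=list)  # optional pointers to narrow the search

@dataclass
class RootCauseHypothesis:
    """Output contract: what comes back to the main agent."""
    root_cause: str        # a focused hypothesis, not a fix
    evidence: list[str]    # file/line references supporting it
    confidence: float      # 0.0-1.0, so the caller can decide to re-verify

def analyze_bug(report: BugReport) -> RootCauseHypothesis:
    """Hypothetical dispatch point: wraps the Phase 1-3 prompts around the
    report and sends them to a dedicated model/subagent, keeping the main
    agent's context free of the investigation transcript."""
    ...
```

The key property is that only the compact `RootCauseHypothesis` re-enters the main agent's context, not the investigation itself.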

FYZAFH avatar Dec 02 '25 17:12 FYZAFH

I think that there's probably a more generalized bit of guidance around "outsource context-heavy activities" - but also I'm not a big fan of 'specialized' agent profiles unless you need different tool use profiles or hardcoded models.

obra avatar Dec 02 '25 23:12 obra

I wanted to add this to the discussion: effective harnesses for long running agents

It feels like there is a convergence of the memory-for-context and specialists-for-tasks paradigms that speaks to what @FYZAFH is asking. I know my researchers working in math-heavy code bases find the Claude models weaker for their work, and I think eventually we'll want specialists if nothing else to push costs down to the lowest tier possible (already happening with Haiku).

I can understand that you can't just freely assimilate this into the scope of your work, but it does feel we're headed that direction.

Troubladore avatar Dec 04 '25 13:12 Troubladore

You completely understand what I mean @Troubladore. On one hand, context is very valuable (not because the window is too small, but because model performance drops sharply as context inflates). On the other hand, Claude is genuinely weaker at algorithms, mathematics, and untangling bug logic; sometimes its reviews also miss the key points.

But simply outsourcing some work to tools or subagents that are better at specialized problems isn't straightforward, because a fresh tool/subagent knows nothing about the prior context. The calling agent has to brief it on the current situation (e.g., for debug analysis, it has to organize and pass along the bug description), and it has to judge whether the output is reliable. In exchange, it avoids investigating the bug itself and doesn't need to know how to investigate complex bugs. However, some bugs were just written by the agent itself, and everything that could explain them is already in the context, so no tool/subagent is needed. I think the key is having a clear boundary for when to use them. I wrote a bug-analysis tool and specified when to use it like this:

## Decision: Use Analyzer or personally analyze?

**Use bug analyzer tool when:**
- Bug in code you didn't write (need to understand system)
- Investigation requires 3+ file reads
- Root cause not immediately obvious
- Want to preserve flow state (actively working on something else)

**Analyze personally when:**
- Bug in code you just wrote (< 1 hour ago, context in working memory)
- Root cause obvious from error message (< 30 seconds to identify)
- Fix is local (single file, few lines)

**Rule of thumb:** If you wrote the code recently AND the root cause is obvious, analyze personally. Otherwise, use the analyzer.
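The decision rule above can be sketched as a single predicate. This is a hypothetical sketch: `Bug` is an invented container for the signals the list mentions, and the thresholds mirror the heuristics as written.

```python
from dataclasses import dataclass

@dataclass
class Bug:
    wrote_code_myself: bool       # did this agent author the buggy code?
    minutes_since_written: float  # how fresh is it in working memory?
    cause_obvious: bool           # root cause clear from the error message?
    files_to_read: int            # estimated investigation footprint
    fix_is_local: bool            # single file, few lines

def use_analyzer(bug: Bug) -> bool:
    """True -> hand off to the bug-analysis tool; False -> analyze in place."""
    recently_wrote_it = bug.wrote_code_myself and bug.minutes_since_written < 60
    # Rule of thumb: recent own code + obvious cause => handle it personally.
    if recently_wrote_it and bug.cause_obvious:
        return False
    # Unfamiliar code, heavy investigation, or unclear cause => outsource
    # to preserve the main agent's context.
    if not bug.wrote_code_myself or bug.files_to_read >= 3 or not bug.cause_obvious:
        return True
    return not bug.fix_is_local
```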

Of course, this still requires a lot of optimization, testing, and benchmarking.

FYZAFH avatar Dec 05 '25 08:12 FYZAFH

Does Cursor have subagents?

TomLucidor avatar Dec 10 '25 15:12 TomLucidor

I've been experimenting with using the codex-cli as an MCP tool for Claude Code.

This can be done by adding the following under the `mcpServers` key in `~/.claude.json`:

```json
"codex": {
  "type": "stdio",
  "command": "codex",
  "args": [
    "-m",
    "gpt-5.2-codex",
    "-c",
    "model_reasoning_effort=high",
    "mcp-server"
  ],
  "env": {}
}
```

This exposes two tools:


  • Tool Name: codex
  • Full Name: mcp__codex__codex

Description

Run a Codex session. Accepts configuration parameters matching the Codex Config struct.

Parameters

  • prompt (required): string The initial user prompt to start the Codex conversation.
  • approval-policy: string Approval policy for shell commands generated by the model: untrusted, on-failure, on-request, never.
  • base-instructions: string The set of instructions to use instead of the default ones.
  • compact-prompt: string Prompt used when compacting the conversation.
  • config: object Individual config settings that will override what is in CODEX_HOME/config.toml.
  • cwd: string Working directory for the session. If relative, it is resolved against the server process's current working directory.
  • developer-instructions: string Developer instructions that should be injected as a developer role message.
  • model: string Optional override for the model name (e.g., "o3", "o4-mini").
  • profile: string Configuration profile from config.toml to specify default options.
  • sandbox: string Sandbox mode: read-only, workspace-write, or danger-full-access.

  • Tool Name: codex-reply
  • Full Name: mcp__codex__codex-reply

Description

Continue an existing Codex conversation by providing the conversation ID and a new prompt.

Parameters

  • conversationId (required): string The unique identification string for the specific Codex session you wish to continue.
  • prompt (required): string The next user prompt to send to the model to continue the conversation flow.
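For instance, a single delegated review call might pass arguments like the following (the prompt, path, and chosen settings are purely illustrative; the parameter names come from the list above):

```json
{
  "prompt": "Review the pagination module for off-by-one errors and summarize the likely root cause.",
  "approval-policy": "never",
  "sandbox": "read-only",
  "cwd": "/path/to/repo"
}
```

A follow-up turn would then go through `codex-reply` with the returned `conversationId` and a new `prompt`.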

Still experimenting with how this might fit into the broader superpowers framework. Right now it's very useful to call manually during planning mode for deep architectural and codebase reviews.

Has anyone else tried this? Any ideas?

seanGSISG avatar Dec 19 '25 21:12 seanGSISG