[BUG] - The Task tool consumed ~77,600 tokens instead of ~15,000 tokens
Preflight Checklist
- [x] I have searched existing issues and this hasn't been reported yet
- [x] This is a single bug report (please file separate reports for different bugs)
- [x] I am using the latest version of Claude Code
What's Wrong?
"Task Tool Consumes Excessive Tokens for Mechanical Refactoring (5-6x cost overrun)" The Task tool consumed ~77,600 tokens (~$6 USD) for simple mechanical refactoring work that should have cost under $1. This represents a 5-6x cost overrun for straightforward find-replace operations.
What Should Happen?
It should have used far fewer tokens (roughly 15,000 rather than 77,600).
Error Messages/Logs
Steps to Reproduce
Issue: Task Tool Consumes Excessive Tokens for Mechanical Refactoring
Summary
The Task tool consumed ~77,600 tokens (~$6 USD) for simple mechanical refactoring work that should have cost under $1. This represents a 5-6x cost overrun for straightforward find-replace operations.
Environment
- Claude Code Version: Latest (December 2025)
- Model: claude-sonnet-4-5-20250929
- Task Type: Mechanical refactoring (type fixes, removing code patterns)
- Repository: Large full-stack TypeScript/React + Django project
Problem Description
I delegated two mechanical refactoring tasks to specialized agents via the Task tool:
Task 1: Fix : any Types in Test Files
What it did:
- Agent read 11 test files completely (~40k tokens)
- Made simple find-replace changes: `: any` → `{ children: React.ReactNode }`, etc.
- Generated verbose summary with explanations (~2k tokens)
What it should have been:
- Direct `Edit` tool calls with specific line ranges (~5k tokens)
- Simple pattern matching and replacement
Token usage: ~42,000 tokens
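To illustrate how small a change of this kind can be, here is a minimal sketch of the same substitution done with a regular expression. The test-file line is hypothetical, since the actual files are not shown in this report, and a real refactor would likely need more than one pattern.

```python
import re

# Hypothetical test-file line; the real files in this report are not shown.
before = "const Wrapper = ({ children }: any) => <Provider>{children}</Provider>;"

# Replace the `any` annotation on a destructured `children` prop with an explicit type.
pattern = re.compile(r"\(\{\s*children\s*\}:\s*any\)")
after = pattern.sub("({ children }: { children: React.ReactNode })", before)

print(after)
```

Only the matching line is touched, which is why reading 11 files in full looks disproportionate for this task.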
Task 2: Remove Unnecessary useMemo/useCallback
What it did:
- Agent read 7+ component files completely (~30k tokens)
- Removed obvious memoization hooks based on clear rules
- Generated extensive summary with reasoning (~5k tokens)
What it should have been:
- Read specific sections using the `Read` tool with limits
- Direct `Edit` tool calls for removals (~10k tokens)
Token usage: ~35,600 tokens
Impact
- Total cost: ~77,600 tokens ≈ $6 USD
- Expected cost: ~15,000 tokens ≈ $1 USD
- Waste: ~$5 USD (5-6x more expensive than necessary)
For users on the $20/month plan, this single interaction consumed ~30% of their monthly budget for work that should have cost ~5%.
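The figures above can be checked with a few lines of arithmetic. The sketch below uses only the approximate numbers reported in this issue. The per-token rate it prints is not a published price, only what the reported dollar amounts imply.

```python
# Figures reported in this issue (approximate).
actual_tokens = 77_600
expected_tokens = 15_000
actual_cost_usd = 6.00
expected_cost_usd = 1.00
monthly_plan_usd = 20.00

# Overrun measured in tokens and in dollars.
token_overrun = actual_tokens / expected_tokens      # about 5.2x
cost_overrun = actual_cost_usd / expected_cost_usd   # 6.0x

# Share of a $20/month budget consumed in each scenario.
actual_share = actual_cost_usd / monthly_plan_usd
expected_share = expected_cost_usd / monthly_plan_usd

# Blended rate implied by the reported figures (an inference, not a published price).
implied_usd_per_million = actual_cost_usd / actual_tokens * 1_000_000

print(f"token overrun: {token_overrun:.1f}x, cost overrun: {cost_overrun:.1f}x")
print(f"budget share: {actual_share:.0%} actual vs {expected_share:.0%} expected")
print(f"implied rate: ~${implied_usd_per_million:.0f} per million tokens")
```

The token ratio (about 5.2x) and the dollar ratio (6x) bracket the "5-6x" figure quoted in the summary.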
Root Cause Analysis
The Task tool spawns specialized agents that:
- Load full file contents for comprehensive context (even when only changing a few lines)
- Generate verbose analysis and detailed summaries (useful for complex work, wasteful for mechanical tasks)
- Use separate LLM instances with full context windows
- Are optimized for reasoning, not mechanical find-replace operations
This design is excellent for complex analysis but catastrophically inefficient for simple refactoring.
Reproduction Steps
- Have a codebase with mechanical refactoring needs (e.g., fix `any` types, remove patterns)
- Ask Claude Code: "Fix all `: any` types in test files"
- Claude delegates to Task tool with `frontend-developer` agent
- Agent reads all files, makes simple changes, generates detailed report
- Observe 40k+ token usage for work that should be 5k tokens
Expected Behavior
Claude Code should:
- Detect mechanical operations (pattern matching, find-replace, removing specific code patterns)
- Warn the user about high token cost before spawning agents
- Use direct tools (`Edit`, `Read` with limits, `Grep`) for simple refactoring
- Reserve Task tool for genuinely complex work requiring reasoning
Current vs. Ideal Workflow
Current (Inefficient):
User: "Remove unnecessary useMemo hooks"
↓
Claude: Uses Task tool → frontend-developer agent
↓
Agent: Reads entire files (~30k tokens)
Agent: Makes simple edits (removes hooks)
Agent: Generates verbose summary (~5k tokens)
↓
Total: 35k tokens
Ideal (Efficient):
User: "Remove unnecessary useMemo hooks"
↓
Claude: Uses Grep to find useMemo usage (~500 tokens)
Claude: Reads specific sections with Read tool (~2k tokens)
Claude: Makes targeted edits with Edit tool (~3k tokens)
Claude: Reports changes briefly (~500 tokens)
↓
Total: 6k tokens (83% reduction)
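The "Grep, then read specific sections" steps of the ideal workflow can be sketched as a small helper that finds matching lines and loads only a few lines of context around each one. This is an illustration, not how Claude Code's tools are implemented. The function names and the 200-line sample file are hypothetical, and line counts are only a rough proxy for tokens.

```python
import re

def match_windows(lines, pattern, context=2):
    """Return merged (start, end) line ranges (0-based, end exclusive) around each match."""
    regex = re.compile(pattern)
    windows = []
    for i, line in enumerate(lines):
        if not regex.search(line):
            continue
        start = max(0, i - context)
        end = min(len(lines), i + context + 1)
        if windows and start <= windows[-1][1]:
            # Overlaps or touches the previous window: extend it instead of adding a new one.
            windows[-1] = (windows[-1][0], max(windows[-1][1], end))
        else:
            windows.append((start, end))
    return windows

def lines_to_read(lines, pattern, context=2):
    """Count how many lines a targeted read would load, versus the whole file."""
    targeted = sum(end - start for start, end in match_windows(lines, pattern, context))
    return targeted, len(lines)

# Hypothetical 200-line component with two memoization hooks.
source = ["// filler"] * 200
source[40] = "const total = useMemo(() => a + b, [a, b]);"
source[150] = "const onClick = useCallback(() => save(), []);"

targeted, full = lines_to_read(source, r"\buse(Memo|Callback)\b")
print(f"targeted read: {targeted} lines, full read: {full} lines")
```

With two hooks and two lines of context on each side, the targeted read covers 10 of 200 lines. Real components need more context to judge whether a hook is safe to remove, so the actual saving would be smaller.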
Recommendations
Short-term Fixes:
- Add cost warnings before spawning Task agents:
  ⚠️ This will use a specialized agent (~30k tokens, ~$2.40).
  For simple refactoring, direct edits would cost ~5k tokens ($0.40).
  Continue? [y/N]
- Pattern detection for mechanical tasks:
  - Detect keywords: "fix all", "remove all", "replace", "find and replace"
  - Detect patterns: type fixes, removing hooks, renaming
  - Route these to direct tool usage instead of agents
- Add a `--no-agent` flag to force direct tool usage: `claude --no-agent "fix any types in test files"`
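As a rough illustration of the pattern-detection idea, the sketch below routes a request based on the keywords listed above plus a few regular expressions for type fixes, hooks, and renaming. The function name, the routing labels, and the exact patterns are assumptions made for this example. They do not describe any existing Claude Code behavior.

```python
import re

# Keywords taken from the list above; the routing labels are hypothetical.
MECHANICAL_KEYWORDS = ("fix all", "remove all", "find and replace", "replace")
MECHANICAL_PATTERNS = (
    r"\brenam(e|ing)\b",         # renaming
    r"\buse(Memo|Callback)\b",   # removing hooks
    r":\s*any\b",                # type fixes
)

def route_request(prompt: str) -> str:
    """Return "direct-tools" for likely mechanical work, else "task-agent"."""
    lowered = prompt.lower()
    if any(keyword in lowered for keyword in MECHANICAL_KEYWORDS):
        return "direct-tools"
    if any(re.search(p, prompt, re.IGNORECASE) for p in MECHANICAL_PATTERNS):
        return "direct-tools"
    return "task-agent"

print(route_request('Fix all ": any" types in test files'))      # direct-tools
print(route_request("Remove unnecessary useMemo hooks"))          # direct-tools
print(route_request("Explain why checkout is slow under load"))   # task-agent
```

Substring matching this naive will produce false positives (for example, any prompt containing "replace"), so a real implementation would still need a user override.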
Long-term Improvements:
- Smart routing logic:
  - Simple refactoring → Direct tools (Read, Edit, Grep)
  - Complex analysis → Task agents
  - Let users override with explicit flags
- Budget-aware agents:
  - Agents should use `Read` with line limits, not full files
  - Agents should use `Grep` to locate changes, not read everything
  - Summaries should be concise by default (verbose optional)
- Post-task cost reporting:
  ✅ Task complete
  Tokens used: 35,600 (~$2.85)
  Estimated cost if done manually: ~6,000 tokens (~$0.48)
  Overhead: 83%
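The numbers in the proposed report can be reproduced with a short sketch. The function name is hypothetical. The $80 per million tokens rate is only what the dollar figures in this issue imply, not a published price. "Overhead" is taken here to mean the share of used tokens that a direct approach would have avoided.

```python
def cost_report(tokens_used: int, baseline_tokens: int, usd_per_million: float) -> str:
    """Format a post-task summary comparing agent usage with a direct-tool baseline."""
    used_cost = tokens_used * usd_per_million / 1_000_000
    baseline_cost = baseline_tokens * usd_per_million / 1_000_000
    # Share of the tokens that a direct approach would not have spent.
    overhead = (tokens_used - baseline_tokens) / tokens_used
    return (
        f"Tokens used: {tokens_used:,} (~${used_cost:.2f})\n"
        f"Estimated cost if done manually: ~{baseline_tokens:,} tokens (~${baseline_cost:.2f})\n"
        f"Overhead: {overhead:.0%}"
    )

# $80 per million tokens is the rate implied by the figures in this issue, not a published price.
print(cost_report(35_600, 6_000, usd_per_million=80.0))
```

Under that definition, 29,600 of the 35,600 tokens are overhead, which matches the 83% in the mock-up.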
Workaround (For Users Now)
When doing mechanical refactoring, explicitly instruct Claude:
"Fix any types in test files.
DO NOT use the Task tool.
Use Grep to find files, Read to check them, and Edit to fix them directly."
This forces Claude to use efficient direct tools instead of spawning agents.
Additional Context
This issue is critical for:
- Budget-conscious users on $20/month plans
- Large codebases where mechanical refactoring is common
- CI/CD automation where token costs compound across runs
- Teaching/learning scenarios where users learn bad token-efficiency habits
The Task tool is powerful for complex work, but needs guardrails to prevent misuse on simple operations.
Related Issues
- N/A (First report of this specific issue)
Suggested Labels
- bug - Inefficient default behavior
- cost-optimization - Directly impacts user costs
- enhancement - Needs smart routing logic
- documentation - Users should know when to avoid Task tool
Claude Model
None
Is this a regression?
Yes, this worked in a previous version
Last Working Version
No response
Claude Code Version
2.0.14
Platform
Anthropic API
Operating System
Windows
Terminal/Shell
Terminal.app (macOS)
Additional Information
No response
Thank you for your suggestions regarding the prompts to give Claude about not using the Task feature: this has helped me a lot!
I've also found that disabling auto-compact via /config in the CLI helps, since auto-compact is another source of token overconsumption.
If anyone else finds ways to reduce token consumption while we wait, feel free to share! 👍
Thank you, I will disable /config/Auto-compact and test.
Also: the new background agent feature (run_in_background) appears to spam the main chat window with significant noise when retrieving results via TaskOutput.
This compounds the token consumption problem - not only do the agents themselves consume excessive tokens during execution, but their tool call logs and intermediate outputs leak back into the parent context window when checking on background tasks.
This seems related to #14118 (background subagent tool calls exposed in parent context). The combination of:
- Agents reading full files unnecessarily (this issue)
- That content then leaking back to parent context (background mode)
...results in even more context window bloat than the original issue describes for foreground agents.
I'm not sure if this background agent problem is a separate issue. If it is, I can open a separate issue.