Preflight Checklist

[x] I have searched existing issues and this hasn't been reported yet
[x] This is a single bug report (please file separate reports for different bugs)
[x] I am using the latest version of Claude Code

What's Wrong?

Task Tool Consumes Excessive Tokens for Mechanical Refactoring (5-6x cost overrun)

The Task tool consumed ~77,600 tokens (~$6 USD) for simple mechanical refactoring work that should have cost under $1. This represents a 5-6x cost overrun for straightforward find-replace operations.

What Should Happen?

It should have used far fewer tokens: roughly 15,000 instead of ~77,600.

Error Messages/Logs

Steps to Reproduce

Issue: Task Tool Consumes Excessive Tokens for Mechanical Refactoring

Summary

The Task tool consumed ~77,600 tokens (~$6 USD) for simple mechanical refactoring work that should have cost under $1. This represents a 5-6x cost overrun for straightforward find-replace operations.

Environment

Claude Code Version: Latest (December 2025)
Model: claude-sonnet-4-5-20250929
Task Type: Mechanical refactoring (type fixes, removing code patterns)
Repository: Large full-stack TypeScript/React + Django project

Problem Description

I delegated two mechanical refactoring tasks to specialized agents via the Task tool:

Task 1: Fix `: any` Types in Test Files

What it did:

Agent read 11 test files completely (~40k tokens)
Made simple find-replace changes, e.g. `: any` → `{ children: React.ReactNode }`
Generated verbose summary with explanations (~2k tokens)

What it should have been:

Direct Edit tool calls with specific line ranges (~5k tokens)
Simple pattern matching and replacement

Token usage: ~42,000 tokens
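
To illustrate how mechanical this kind of change is, the replacement named above can be written as a single regular-expression substitution. This is a minimal sketch, not what the agent actually ran: the helper name `fix_children_any` and the sample line are hypothetical, and it covers only the `{ children }: any` case.

```python
import re

# Matches a destructured `children` prop typed as `any`, e.g. `({ children }: any)`.
CHILDREN_ANY = re.compile(r"\{\s*children\s*\}\s*:\s*any\b")

def fix_children_any(source: str) -> tuple[str, int]:
    """Replace `{ children }: any` with a concrete props type.

    Returns the rewritten source and the number of replacements made.
    """
    return CHILDREN_ANY.subn("{ children }: { children: React.ReactNode }", source)

# Hypothetical line from a test file.
sample = "const wrapper = ({ children }: any) => <Provider>{children}</Provider>;"
fixed, count = fix_children_any(sample)
print(count)
print(fixed)
```

Other `any` annotations need a different target type, so a real pass would still require a quick look at each match. That look only needs the matching lines, not the whole file.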

Task 2: Remove Unnecessary `useMemo`/`useCallback`

What it did:

Agent read 7+ component files completely (~30k tokens)
Removed obvious memoization hooks based on clear rules
Generated extensive summary with reasoning (~5k tokens)

What it should have been:

Read specific sections using Read tool with limits
Direct Edit tool calls for removals (~10k tokens)

Token usage: ~35,600 tokens

Impact

Total cost: ~77,600 tokens ≈ $6 USD
Expected cost: ~15,000 tokens ≈ $1 USD
Waste: ~$5 USD (5-6x more expensive than necessary)

For users on the $20/month plan, this single interaction consumed ~30% of their monthly budget for work that should have cost ~5%.
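
The arithmetic behind these figures can be checked with a short script. This is a rough sketch: the per-token rate is an assumption derived from the report's own numbers (~$6 for ~77,600 tokens), not official pricing, and it treats input and output tokens alike.

```python
# Rate implied by the report: ~$6 for ~77,600 tokens. This is an assumption
# derived from the report's figures, not official API pricing.
USD_PER_TOKEN = 6.0 / 77_600
MONTHLY_BUDGET_USD = 20.0

def cost_usd(tokens: int) -> float:
    """Estimated cost of a token count at the implied blended rate."""
    return tokens * USD_PER_TOKEN

actual_tokens = 42_000 + 35_600  # Task 1 + Task 2
expected_tokens = 15_000         # ~5k + ~10k with direct tools

overrun = actual_tokens / expected_tokens
for label, tokens in (("actual", actual_tokens), ("expected", expected_tokens)):
    usd = cost_usd(tokens)
    share = usd / MONTHLY_BUDGET_USD
    print(f"{label}: {tokens} tokens, about ${usd:.2f} ({share:.0%} of budget)")
print(f"overrun: {overrun:.1f}x")
```

At the token level the overrun comes out slightly above 5x. The 5-6x range reflects the rounded dollar amounts.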

Root Cause Analysis

The Task tool spawns specialized agents that:

Load full file contents for comprehensive context (even when only changing a few lines)
Generate verbose analysis and detailed summaries (useful for complex work, wasteful for mechanical tasks)
Use separate LLM instances with full context windows
Are optimized for reasoning, not mechanical find-replace operations

This design is excellent for complex analysis but catastrophically inefficient for simple refactoring.

Reproduction Steps

Have a codebase with mechanical refactoring needs (e.g., fixing `any` types, removing specific code patterns)
Ask Claude Code: "Fix all : any types in test files"
Claude delegates to Task tool with frontend-developer agent
Agent reads all files, makes simple changes, generates detailed report
Observe 40k+ token usage for work that should be 5k tokens

Expected Behavior

Claude Code should:

Detect mechanical operations (pattern matching, find-replace, removing specific code patterns)
Warn the user about high token cost before spawning agents
Use direct tools (Edit, Read with limits, Grep) for simple refactoring
Reserve Task tool for genuinely complex work requiring reasoning

Current vs. Ideal Workflow

Current (Inefficient):

User: "Remove unnecessary useMemo hooks"
↓
Claude: Uses Task tool → frontend-developer agent
↓
Agent: Reads entire files (~30k tokens)
Agent: Makes simple edits (removes hooks)
Agent: Generates verbose summary (~5k tokens)
↓
Total: 35k tokens

Ideal (Efficient):

User: "Remove unnecessary useMemo hooks"
↓
Claude: Uses Grep to find useMemo usage (~500 tokens)
Claude: Reads specific sections with Read tool (~2k tokens)
Claude: Makes targeted edits with Edit tool (~3k tokens)
Claude: Reports changes briefly (~500 tokens)
↓
Total: 6k tokens (83% reduction)
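
The "locate first, then read only what is needed" idea in the ideal workflow can be sketched as follows. This is an illustration only: `match_windows` is a hypothetical helper, not part of Claude Code, and the 40-line component is synthetic.

```python
import re

def match_windows(text: str, pattern: str, context: int = 3) -> list[tuple[int, int]]:
    """Return merged 1-based (start, end) line ranges around each match."""
    lines = text.splitlines()
    regex = re.compile(pattern)
    windows: list[tuple[int, int]] = []
    for number, line in enumerate(lines, start=1):
        if not regex.search(line):
            continue
        start = max(1, number - context)
        end = min(len(lines), number + context)
        if windows and start <= windows[-1][1] + 1:
            # Overlapping or adjacent to the previous window: extend it.
            windows[-1] = (windows[-1][0], max(windows[-1][1], end))
        else:
            windows.append((start, end))
    return windows

# Synthetic 40-line component with two useMemo calls (lines 10 and 30).
component = "\n".join(
    "const value = useMemo(() => compute(a), [a]);" if i in (10, 30) else f"// line {i}"
    for i in range(1, 41)
)
windows = match_windows(component, r"\buse(Memo|Callback)\(", context=2)
lines_read = sum(end - start + 1 for start, end in windows)
print(windows, lines_read)
```

With two matches and two lines of context on each side, only 10 of the 40 lines would need to be read. The saving grows with file size as long as the matches stay sparse.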

Recommendations

Short-term Fixes:

Add cost warnings before spawning Task agents:

⚠️ This will use a specialized agent (~30k tokens, ~$2.40).
For simple refactoring, direct edits would cost ~5k tokens ($0.40).
Continue? [y/N]

Pattern detection for mechanical tasks:
- Detect keywords: "fix all", "remove all", "replace", "find and replace"
- Detect patterns: type fixes, removing hooks, renaming
- Route these to direct tool usage instead of agents
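
A keyword heuristic along these lines might look like the sketch below. Everything here is hypothetical: the function name, the regexes, and the route labels are illustrative and do not reflect how Claude Code decides when to delegate. Keyword matching will also misclassify some requests, so it would need a user override.

```python
import re

# Phrases taken from the list above.
MECHANICAL_PHRASES = ("fix all", "remove all", "replace", "find and replace")

# Rough patterns for type fixes, hook removal, and renaming (illustrative only).
MECHANICAL_PATTERNS = (
    r"\bfix\b.*\btypes?\b",
    r"\bremove\b.*\b(hooks?|usememo|usecallback)\b",
    r"\brenam(e|ing)\b",
)

def route(request: str) -> str:
    """Return 'direct-tools' for requests that look mechanical, else 'task-agent'."""
    text = request.lower()
    if any(phrase in text for phrase in MECHANICAL_PHRASES):
        return "direct-tools"
    if any(re.search(pattern, text) for pattern in MECHANICAL_PATTERNS):
        return "direct-tools"
    return "task-agent"

print(route("Fix all : any types in test files"))
print(route("Remove unnecessary useMemo hooks"))
print(route("Design a caching strategy for the API layer"))
```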

Add --no-agent flag to force direct tool usage:

claude --no-agent "fix any types in test files"

Long-term Improvements:

Smart routing logic:
- Simple refactoring → Direct tools (Read, Edit, Grep)
- Complex analysis → Task agents
- Let users override with explicit flags

Budget-aware agents:
- Agents should use Read with line limits, not full files
- Agents should use Grep to locate changes, not read everything
- Summaries should be concise by default (verbose optional)

Post-task cost reporting:

✅ Task complete
Tokens used: 35,600 (~$2.85)
Estimated cost if done manually: ~6,000 tokens (~$0.48)
Avoidable: 83% of tokens used
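
The numbers in such a report follow from two token counts. The sketch below assumes the ~$0.08 per 1k tokens rate implied by the mock-up (an assumption, not official pricing), and `cost_report` is a hypothetical helper. The 83% figure is the share of the tokens used that direct tools would have avoided. Measured against the direct-tool baseline instead, the same gap is close to 500%.

```python
USD_PER_1K_TOKENS = 0.08  # assumed: rate implied by "35,600 tokens (~$2.85)"

def cost_usd(tokens: int) -> float:
    """Estimated cost at the assumed flat rate."""
    return tokens / 1000 * USD_PER_1K_TOKENS

def cost_report(used: int, baseline: int) -> dict[str, float]:
    """Compare the tokens actually used against a direct-tool baseline."""
    extra = used - baseline
    return {
        "used_usd": round(cost_usd(used), 2),
        "baseline_usd": round(cost_usd(baseline), 2),
        "avoidable_share": extra / used,           # fraction of spend that was avoidable
        "overhead_vs_baseline": extra / baseline,  # extra spend relative to the baseline
    }

report = cost_report(used=35_600, baseline=6_000)
print(f"Tokens used: 35,600 (~${report['used_usd']:.2f})")
print(f"Direct-tool estimate: 6,000 (~${report['baseline_usd']:.2f})")
print(f"Avoidable: {report['avoidable_share']:.0%}")
print(f"Overhead vs. baseline: {report['overhead_vs_baseline']:.0%}")
```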

Workaround (For Users Now)

When doing mechanical refactoring, explicitly instruct Claude:

"Fix any types in test files.
DO NOT use the Task tool.
Use Grep to find files, Read to check them, and Edit to fix them directly."

This forces Claude to use efficient direct tools instead of spawning agents.

Additional Context

This issue is critical for:

Budget-conscious users on $20/month plans
Large codebases where mechanical refactoring is common
CI/CD automation where token costs compound across runs
Teaching/learning scenarios where users learn bad token-efficiency habits

The Task tool is powerful for complex work, but needs guardrails to prevent misuse on simple operations.

Related Issues

N/A (First report of this specific issue)

Suggested Labels

bug - Inefficient default behavior
cost-optimization - Directly impacts user costs
enhancement - Needs smart routing logic
documentation - Users should know when to avoid Task tool

Claude Model

None

Is this a regression?

Yes, this worked in a previous version

Last Working Version

No response

Claude Code Version

2.0.14

Platform

Anthropic API

Operating System

Windows

Terminal/Shell

Terminal.app (macOS)

Additional Information

No response

Dec 12 '25 05:12 Mangesh-P

Thank you for your suggestions regarding the prompts to give Claude about not using the Task feature: this has helped me a lot!

I've also found that disabling auto-compact via /config in the CLI helps, as auto-compact is another source of excess token consumption.

If anyone else finds ways to reduce token consumption while we wait, feel free to share! 👍

Dec 12 '25 17:12 grandtheftdisco

Thank you, I will disable auto-compact via /config and test.

Dec 12 '25 17:12 Mangesh-P

Also: the new background agent feature (run_in_background) appears to spam the main chat window with significant noise when retrieving results via TaskOutput.

This compounds the token consumption problem - not only do the agents themselves consume excessive tokens during execution, but their tool call logs and intermediate outputs leak back into the parent context window when checking on background tasks.

This seems related to #14118 (background subagent tool calls exposed in parent context). The combination of:

Agents reading full files unnecessarily (this issue)
That content then leaking back to parent context (background mode)

...results in even more context window bloat than the original issue describes for foreground agents.

Dec 18 '25 13:12 PaulRBerg

I'm not sure if this background agent problem is a separate issue. If it is, I can open a separate issue.

Dec 18 '25 13:12 PaulRBerg

[BUG] - The Task tool consumed ~77,600 tokens instead of ~15,000 tokens
