[Feature Request] Make Agent Aware of token usage and cost

Open brian-ln opened this issue 8 months ago • 1 comments

I need the claude-code agent to be aware of its token usage and cost at any point in the workflow.

My primary goal is to include token use and cost in commit messages and session summary logs.
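For the commit-message case, here is a minimal sketch of what the agent could do once it has the numbers. It assumes the counts are already available from somewhere; the trailer keys (`Tokens-Input`, `Tokens-Output`, `Cost-USD`) and the function name are made up for illustration and are not an existing convention.

```python
def with_usage_trailers(message, input_tokens, output_tokens, cost_usd=None):
    """Append token usage to a commit message as git-style trailers.

    Trailer keys are hypothetical; pick whatever your tooling expects.
    """
    trailers = [
        f"Tokens-Input: {input_tokens}",
        f"Tokens-Output: {output_tokens}",
    ]
    if cost_usd is not None:
        # Cost may be unavailable (e.g. hidden on some plans), so it is optional.
        trailers.append(f"Cost-USD: {cost_usd:.2f}")
    # Trailers form the last paragraph, separated from the body by a blank line.
    return message.rstrip() + "\n\n" + "\n".join(trailers) + "\n"


commit_message = with_usage_trailers("Fix button contrast", 1800, 600, cost_usd=0.03)
```

The same key/value lines could be written to a session summary log instead of a commit message.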

brian-ln avatar Apr 12 '25 20:04 brian-ln

I want this. I know how long a context session stays useful for a given task type, or how many tokens I think something might take. If Claude could be directed or advised on this in a meaningful way, that would be very useful.

ddisisto avatar May 20 '25 07:05 ddisisto

Strong +1 for this feature - critical for agent development

I'm building UI/UX analysis agents and hit major blockers around token tracking that make this feature essential:

Current Blockers

1. Max/Pro Subscribers Have Zero Visibility

The /cost command explicitly excludes Max/Pro users with "don't worry about tokens" - meaning no data at all. This makes it impossible to:

  • Optimize agent prompts based on actual consumption
  • Compare efficiency between different agent implementations
  • Budget token usage across multi-agent workflows
  • Justify costs to stakeholders

2. Parallel Agent Execution Breaks Manual Tracking

Claude Code encourages running agents in parallel for performance, but this makes before/after delta calculation impossible:

  • Multiple agents executing simultaneously → can't attribute token usage
  • Background processes consume tokens → creates noise in measurements
  • Session-level totals only → no per-agent breakdown
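To make the attribution problem concrete, here is a small simulation. Everything in it is hypothetical: `SessionCounter` stands in for a session-wide total, and the token amounts are invented. It only shows why before/after deltas over a shared counter stop working once two agents overlap.

```python
class SessionCounter:
    """Stand-in for a session-level token total (not a real Claude Code API)."""

    def __init__(self):
        self.total = 0

    def spend(self, tokens):
        self.total += tokens


def run_interleaved(counter):
    """Interleave two agents and measure each one with a before/after delta."""
    true_usage = {"agent_a": 0, "agent_b": 0}
    before = {}
    measured = {}

    # Agent B starts and finishes while agent A is still running.
    timeline = [
        ("start", "agent_a", 0),
        ("spend", "agent_a", 500),
        ("start", "agent_b", 0),
        ("spend", "agent_b", 300),
        ("spend", "agent_a", 700),
        ("end", "agent_b", 0),
        ("spend", "agent_a", 200),
        ("end", "agent_a", 0),
    ]
    for kind, agent, tokens in timeline:
        if kind == "start":
            before[agent] = counter.total      # snapshot the shared total
        elif kind == "spend":
            counter.spend(tokens)
            true_usage[agent] += tokens        # ground truth we normally can't see
        else:
            measured[agent] = counter.total - before[agent]
    return true_usage, measured


true_usage, measured = run_interleaved(SessionCounter())
# true_usage -> {"agent_a": 1400, "agent_b": 300}
# measured   -> {"agent_a": 1700, "agent_b": 1000}
```

Both deltas are inflated by the other agent's spend, and together they add up to more than the session total of 1,700.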

3. No Programmatic Access for Agents

Agents can't query their own consumption metrics. What's needed is a tool agents can call:

```python
GetTokenUsage(scope="current_agent")

# Returns:
{
    "input_tokens": 1800,
    "output_tokens": 600,
    "cache_read_tokens": 450,
    "cache_creation_tokens": 200,
    "total": 3050,
    "model": "claude-sonnet-4-5-20250929"
}
```
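As a rough sketch of the result shape, the payload could be modeled like this. `TokenUsage` is a hypothetical type, not something Claude Code exposes today, and treating `total` as the plain sum of the four counters is an assumption that happens to match the example numbers (1800 + 600 + 450 + 200 = 3050).

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TokenUsage:
    """Hypothetical result shape for the proposed GetTokenUsage tool."""

    input_tokens: int
    output_tokens: int
    cache_read_tokens: int
    cache_creation_tokens: int
    model: str

    @property
    def total(self) -> int:
        # Assumption: total is the sum of all four counters.
        return (
            self.input_tokens
            + self.output_tokens
            + self.cache_read_tokens
            + self.cache_creation_tokens
        )

    def to_payload(self) -> dict:
        """Return the dict an agent would receive, including the derived total."""
        payload = asdict(self)
        payload["total"] = self.total
        return payload


usage = TokenUsage(1800, 600, 450, 200, "claude-sonnet-4-5-20250929")
payload = usage.to_payload()
```

Deriving `total` instead of storing it keeps the payload from ever disagreeing with its own counters.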

Real Use Case: UI/UX Analysis Agents

When analyzing multiple components, I need agents to self-report:

```
✓ UI Analysis Agent completed

  • Analyzed 12 components
  • Found 8 accessibility issues
  • Token usage: 2,400 tokens (~$0.03)
    • Input: 1,800 tokens
    • Output: 600 tokens
    • Cache savings: 450 tokens
```
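A report like that could be rendered from a usage payload with something along these lines. This is a sketch under assumptions: `format_report` is a made-up helper, "Token usage" is taken to mean input plus output, "Cache savings" is taken to mean `cache_read_tokens`, and the cost is passed in by the caller since no pricing source is specified here.

```python
def format_report(agent_name, findings, usage, cost_usd=None):
    """Render a self-report from a usage dict (hypothetical helper)."""
    # Assumption: the headline figure is input + output, excluding cache counters.
    headline = usage["input_tokens"] + usage["output_tokens"]
    # Cost is optional so the report still works when costs are hidden.
    cost = f" (~${cost_usd:.2f})" if cost_usd is not None else ""

    lines = [f"✓ {agent_name} completed", ""]
    lines += [f"  • {item}" for item in findings]
    lines += [
        f"  • Token usage: {headline:,} tokens{cost}",
        f"    • Input: {usage['input_tokens']:,} tokens",
        f"    • Output: {usage['output_tokens']:,} tokens",
        f"    • Cache savings: {usage['cache_read_tokens']:,} tokens",
    ]
    return "\n".join(lines)


example_usage = {"input_tokens": 1800, "output_tokens": 600, "cache_read_tokens": 450}
report = format_report(
    "UI Analysis Agent",
    ["Analyzed 12 components", "Found 8 accessibility issues"],
    example_usage,
    cost_usd=0.03,
)
report_without_cost = format_report("UI Analysis Agent", [], example_usage)
```

Leaving `cost_usd` unset gives token counts only, which is the behavior I'd want for Max/Pro accounts.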

This enables:

  • Users choosing between fast/cheap vs. thorough/expensive analysis
  • Developers optimizing agent efficiency over iterations
  • Organizations justifying agent resource allocation

Why OpenTelemetry Doesn't Solve This

I saw #6925 was closed with "use OTel" - but that approach:

  • Requires admin infrastructure setup (not self-service)
  • Provides no agent-level attribution (only session-level)
  • Depends on an external monitoring stack
  • Still gives Max/Pro users no visibility

Agents need self-service access to their own metrics - like how they can call `Read` or `Bash` without infrastructure setup.

Proposed Solution

Add `GetTokenUsage` as a built-in tool (similar to existing tools):

  • Agent-scoped metrics - Track consumption per agent invocation
  • Subtask attribution - When agents spawn sub-agents, track hierarchy (relates to #10164)
  • Max/Pro support - Show token counts even if costs are hidden
  • Real-time access - Agents query during/after execution
  • Zero infrastructure - Works out of the box like other tools
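To illustrate the agent-scoped and subtask-attribution points, here is a minimal sketch of a per-agent ledger with parent/child rollup. `UsageLedger`, its methods, and the agent ids are all hypothetical; this is one possible bookkeeping scheme, not how Claude Code tracks anything internally.

```python
from collections import defaultdict


class UsageLedger:
    """Hypothetical per-agent token ledger with sub-agent rollup."""

    def __init__(self):
        self._own = defaultdict(int)  # tokens spent directly by each agent
        self._parent = {}             # agent id -> parent id (None for a root)

    def register(self, agent_id, parent_id=None):
        """Declare an agent, optionally as a sub-agent of parent_id."""
        self._parent[agent_id] = parent_id

    def record(self, agent_id, tokens):
        """Attribute tokens to the agent that actually spent them."""
        if agent_id not in self._parent:
            raise KeyError(f"unknown agent: {agent_id}")
        self._own[agent_id] += tokens

    def own_usage(self, agent_id):
        """Tokens spent by this agent alone."""
        return self._own[agent_id]

    def total_usage(self, agent_id):
        """Own tokens plus everything spent by descendants."""
        children = [a for a, p in self._parent.items() if p == agent_id]
        return self._own[agent_id] + sum(self.total_usage(c) for c in children)


ledger = UsageLedger()
ledger.register("main")
ledger.register("ui_analysis", parent_id="main")
ledger.register("a11y_check", parent_id="ui_analysis")
ledger.record("main", 100)
ledger.record("ui_analysis", 1800)
ledger.record("a11y_check", 600)
```

Because usage is keyed by agent id at record time, the totals stay correct no matter how the agents interleave, which is exactly what session-level deltas can't offer.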

Impact

Without this, building production-quality agents requires blind optimization. Token estimates based on input/output lengths are highly inaccurate with prompt caching, context windows, and multi-turn conversations.

As the agent ecosystem grows, this becomes increasingly critical for developers who need to justify resource allocation and optimize their implementations.

ProgenyAlpha avatar Oct 26 '25 19:10 ProgenyAlpha