[Feature Request] Make Agent Aware of token usage and cost
I need the claude-code agent to be aware of its token usage and cost at any point in the workflow.
My primary goal is to include token use and cost in commit messages and session summary logs.
I want this too. I have a sense of how long a context session stays useful for a given task type, and roughly how many tokens a task should take. If Claude could be directed or advised on this in a meaningful way, it would be very useful.
Strong +1 for this feature - critical for agent development
I'm building UI/UX analysis agents and hit major blockers around token tracking that make this feature essential:
Current Blockers
1. Max/Pro Subscribers Have Zero Visibility
The `/cost` command explicitly tells Max/Pro users "don't worry about tokens" and returns no data at all. This makes it impossible to:
- Optimize agent prompts based on actual consumption
- Compare efficiency between different agent implementations
- Budget token usage across multi-agent workflows
- Justify costs to stakeholders
2. Parallel Agent Execution Breaks Manual Tracking
Claude Code encourages running agents in parallel for performance, but this makes before/after delta calculation impossible:
- Multiple agents executing simultaneously → can't attribute token usage
- Background processes consume tokens → creates noise in measurements
- Session-level totals only → no per-agent breakdown
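To make the attribution problem concrete, here is a minimal sketch with made-up numbers; `session_total()` stands in for whatever session-level counter is available. The before/after delta works for one sequential agent but mixes usage as soon as two agents overlap:

```python
# Illustrative only: why before/after deltas fail under parallelism.
session = {"total_tokens": 0}

def session_total():
    return session["total_tokens"]

def run_agent(name, cost):
    # Simulates an agent consuming tokens against the shared session counter.
    session["total_tokens"] += cost

# Sequential: the delta cleanly attributes usage to one agent.
before = session_total()
run_agent("ui-analysis", cost=2400)
print("sequential delta:", session_total() - before)  # 2400, correct

# Parallel: imagine the next two agents run concurrently; the only
# observable delta is their sum, which cannot be attributed to either one.
before = session_total()
run_agent("ui-analysis", cost=2400)
run_agent("a11y-audit", cost=1300)
print("parallel delta:", session_total() - before)  # 3700, unattributable
```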
3. No Programmatic Access for Agents
Agents can't query their own consumption metrics. What's needed is a tool agents can call:
```python
GetTokenUsage(scope="current_agent")
# Returns:
# {
#   "input_tokens": 1800,
#   "output_tokens": 600,
#   "cache_read_tokens": 450,
#   "cache_creation_tokens": 200,
#   "total": 3050,
#   "model": "claude-sonnet-4-5-20250929"
# }
```
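With a response like that, an agent could estimate its own cost. A minimal sketch, using illustrative placeholder prices (the real per-model rates are an assumption here, not taken from any official price list):

```python
# Sketch: estimate dollar cost from a GetTokenUsage-style response.
# Prices are illustrative placeholders, not official rates.
PRICE_PER_MTOK = {
    "input": 3.00,           # $ per million input tokens (assumed)
    "output": 15.00,         # $ per million output tokens (assumed)
    "cache_read": 0.30,      # cache reads are typically heavily discounted
    "cache_creation": 3.75,  # cache writes typically carry a premium
}

def estimate_cost(usage: dict) -> float:
    """Estimate dollar cost from a token-usage breakdown."""
    return (
        usage["input_tokens"] * PRICE_PER_MTOK["input"]
        + usage["output_tokens"] * PRICE_PER_MTOK["output"]
        + usage["cache_read_tokens"] * PRICE_PER_MTOK["cache_read"]
        + usage["cache_creation_tokens"] * PRICE_PER_MTOK["cache_creation"]
    ) / 1_000_000

usage = {
    "input_tokens": 1800,
    "output_tokens": 600,
    "cache_read_tokens": 450,
    "cache_creation_tokens": 200,
}
print(f"${estimate_cost(usage):.4f}")
```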
Real Use Case: UI/UX Analysis Agents
When analyzing multiple components, I need agents to self-report:
```
✓ UI Analysis Agent completed
- Analyzed 12 components
- Found 8 accessibility issues
- Token usage: 2,400 tokens (~$0.03)
  • Input: 1,800 tokens
  • Output: 600 tokens
  • Cache savings: 450 tokens
```
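Given a usage breakdown like the proposed tool would return, producing that self-report is trivial. A sketch (the function name and cost figure are illustrative, not an existing API):

```python
# Sketch: format an agent's self-reported usage summary.
def format_usage_summary(components: int, issues: int, usage: dict,
                         cost_usd: float) -> str:
    total = usage["input_tokens"] + usage["output_tokens"]
    lines = [
        "✓ UI Analysis Agent completed",
        f"- Analyzed {components} components",
        f"- Found {issues} accessibility issues",
        f"- Token usage: {total:,} tokens (~${cost_usd:.2f})",
        f"  • Input: {usage['input_tokens']:,} tokens",
        f"  • Output: {usage['output_tokens']:,} tokens",
        f"  • Cache savings: {usage['cache_read_tokens']:,} tokens",
    ]
    return "\n".join(lines)

print(format_usage_summary(
    components=12, issues=8,
    usage={"input_tokens": 1800, "output_tokens": 600,
           "cache_read_tokens": 450},
    cost_usd=0.03,
))
```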
This enables:
- Users choosing between fast/cheap vs. thorough/expensive analysis
- Developers optimizing agent efficiency over iterations
- Organizations justifying agent resource allocation
Why OpenTelemetry Doesn't Solve This
I saw #6925 was closed with "use OTel", but OpenTelemetry:
- Requires admin infrastructure setup (not self-service)
- Provides only session-level attribution, not per-agent
- Depends on an external monitoring stack
- Still gives no visibility to Max/Pro users
Agents need self-service access to their own metrics - like how they can call `Read` or `Bash` without infrastructure setup.
Proposed Solution
Add `GetTokenUsage` as a built-in tool (similar to existing tools):
- Agent-scoped metrics - Track consumption per agent invocation
- Subtask attribution - When agents spawn sub-agents, track hierarchy (relates to #10164)
- Max/Pro support - Show token counts even if costs are hidden
- Real-time access - Agents query during/after execution
- Zero infrastructure - Works out of the box like other tools
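To illustrate what subtask attribution could look like, here is a minimal in-memory sketch of per-agent accounting with parent/child rollup. The class and its methods are hypothetical, not part of Claude Code:

```python
from collections import defaultdict

# Hypothetical sketch of per-agent token accounting with sub-agent rollup.
class UsageTracker:
    def __init__(self):
        self.usage = defaultdict(int)      # agent_id -> own token count
        self.children = defaultdict(list)  # agent_id -> spawned sub-agents

    def record(self, agent_id: str, tokens: int):
        self.usage[agent_id] += tokens

    def spawn(self, parent_id: str, child_id: str):
        self.children[parent_id].append(child_id)

    def total(self, agent_id: str) -> int:
        """Own usage plus everything consumed by the sub-agent subtree."""
        return self.usage[agent_id] + sum(
            self.total(child) for child in self.children[agent_id]
        )

tracker = UsageTracker()
tracker.record("orchestrator", 500)
tracker.spawn("orchestrator", "ui-analysis")
tracker.record("ui-analysis", 2400)
tracker.spawn("orchestrator", "a11y-audit")
tracker.record("a11y-audit", 1300)

print(tracker.total("ui-analysis"))   # per-agent breakdown
print(tracker.total("orchestrator"))  # hierarchical rollup
```

The same scoping distinction is what `scope="current_agent"` in the proposed tool would express: an agent's own number versus its subtree total.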
Impact
Without this, building production-quality agents means optimizing blind. Token estimates derived from input/output text lengths become highly inaccurate once prompt caching, context-window management, and multi-turn conversations come into play.
As the agent ecosystem grows, this becomes increasingly critical for developers who need to justify resource allocation and optimize their implementations.