[Bug] Excessive token usage on new session initialization
Bug Description [report]: Could you please explain what happened with my newly cleared session, where "Current session" shows 31% and "Weekly limits" 4%? Guys, it's robbery. I had just started the session, sent only one message for planning, covering only a small part of the project, and got unbelievable token usage. Very bad!!!!
Thinking was off, but Opus 4.5 was on by default. Even so, it can't have used that much just creating one plan file with 620 lines of text (including code examples).
Environment Info
- Platform: darwin
- Terminal: iTerm.app
- Version: 2.0.64
- Feedback ID: 7e3c88dc-8f71-47e6-aa73-23fdd3696399
Errors
[{"error":"Error\n at xw (/$bunfs/root/claude:189:1143)\n at <anonymous> (/$bunfs/root/claude:190:10080)\n at emit (node:events:92:22)\n at endReadableNT (internal:streams/readable:861:50)\n at processTicksAndRejections (native:7:39)\n at request (/$bunfs/root/claude:192:2147)\n at processTicksAndRejections (native:7:39)","timestamp":"2025-12-10T07:19:08.470Z"},{"error":"Error\n at xw (/$bunfs/root/claude:189:1143)\n at <anonymous> (/$bunfs/root/claude:190:10080)\n at emit (node:events:92:22)\n at endReadableNT (internal:streams/readable:861:50)\n at processTicksAndRejections (native:7:39)\n at request (/$bunfs/root/claude:192:2147)\n at processTicksAndRejections (native:7:39)","timestamp":"2025-12-10T07:19:08.721Z"},{"error":"ConfigParseError: Invalid schema: name: Marketplace name cannot impersonate official Anthropic/Claude marketplaces. Names containing \"official\", \"anthropic\", or \"claude\" in official-sounding combinations are reserved.\n at BBB (/$bunfs/root/claude:1361:869)\n at CuR (/$bunfs/root/claude:1361:2934)\n at async Ok (/$bunfs/root/claude:1363:601)\n at async cH_ (/$bunfs/root/claude:4478:5773)\n at processTicksAndRejections (native:7:39)","timestamp":"2025-12-10T07:19:13.089Z"},{"error":"Error: Request was aborted.\n at _createMessage (/$bunfs/root/claude:128:3151)\n at processTicksAndRejections (native:7:39)","timestamp":"2025-12-10T07:28:43.041Z"}]
Found 3 possible duplicate issues:
- https://github.com/anthropics/claude-code/issues/12333
- https://github.com/anthropics/claude-code/issues/13532
- https://github.com/anthropics/claude-code/issues/13326
This issue will be automatically closed as a duplicate in 3 days.
- If your issue is a duplicate, please close it and 👍 the existing issue instead
- To prevent auto-closure, add a comment or 👎 this comment
🤖 Generated with Claude Code
Same here
Same here. I just started my new 5-hour session, ran only "/compact", and checked the usage to find it had already spent 9%!!!!
The previous session was consumed freakishly fast, and I felt something was wrong because I never get near the limit, but I told myself maybe I'd lost track. Now it's obvious there is some sort of bug in how the session limit is being calculated.
Please fix.
Same here. Every time I JUST OPEN it, it consumes 4%.
+1
The same today with version 2.0.64: did nothing at all, just exited and re-entered, typed /context (+2%), typed a custom command name and stopped it immediately (+2%), and "Current session" shows 4% for nothing. It's a bug and should be fixed @claude
Absolutely the same happens to me. Before 2.0.64, on the Max plan it was impossible for me (given my coding tasks/style) to hit the session (5h) limit. After 2.0.64 I hit the limit in 3.5 hours and was prompted to upgrade to a bigger plan. @claude
Seems like @claude wants to rob us to show big income before going to IPO.
As a temporary workaround, try downgrading with `claude install 2.0.62`, and after Claude starts, check that "Thinking" mode is disabled (for me it always starts with Thinking mode on, even with `"alwaysThinkingEnabled": false` in settings.json).
@eoris I did `claude install 2.0.62`, but it still shows as 2.0.65. Is there a catch?
It always auto-updates to the latest version. @eoris please let me know how to prevent the auto-update as well.
@rbarcante @Maharaj95 try adding `DISABLE_AUTOUPDATER=1` to your .bashrc or .zshrc
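For anyone unsure where that goes, a minimal sketch (assuming zsh; use `~/.bashrc` for bash):

```shell
# Append the env var to your shell rc so every new terminal session
# starts with the Claude Code auto-updater disabled
echo 'export DISABLE_AUTOUPDATER=1' >> ~/.zshrc

# Reload the rc (or open a new terminal) and confirm it took effect
source ~/.zshrc
echo "$DISABLE_AUTOUPDATER"   # prints: 1
```

Pin the version with `claude install 2.0.62` only after the variable is in place, otherwise the updater may pull the latest release again on next start.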
Same here. Hitting a 5 hour limit within 30 minutes without any change in my workflow.
⭐ TIPS:
I've been browsing similar area:cost issues re: Claude this morning and have gathered a few tips in one place. These don't fix the issue entirely, but they've helped cut down my usage in the interim. Hope these help someone!
- Direct Claude not to use the Task feature, and to use utility tools for common tasks. Add a `--no-agent` instruction to your prompt as well (several users report that Claude recruits subagents without disclosing this to the user).
  - e.g. "--no-agent for the remainder of our work in this session. DO NOT use the Task tool. Use Grep to find files, Read to check them, and Edit to fix them directly."
- Use `/config` in the CLI to disable auto-compact, as this is another source of token overconsumption.
If anyone else finds ways to reduce token consumption while we wait, feel free to share! 👍
Thank you so much. This reduced my usage significantly. I switched to Sonnet as well, just as an extra safety net.
@Maharaj95 HOORAY! I'm glad it helped you! Hopefully they fix this soon 😖
same here, any updates?
MCP tools can consume a bunch of the context.
Use /context to understand how your context is being used, and consider using the @ to toggle some MCP servers when you don't need them.
Also, auto-compact doesn't actually consume context by spending tokens. It reserves some tokens so it has room at the end of the session; this doesn't cost you tokens so much as reduce your available window.
I have disabled it to give myself more headroom.
For efficiency, consider also using haiku sub-agents for some tasks (like running commands) to reduce the main context usage and use a more cost-effective model.
📌 Token consumption observations & mitigation strategies (Claude CLI)
Thanks for sharing these tips — they’ve been genuinely helpful. I wanted to add some additional observations and patterns from my own workflow that might help others dealing with high token usage.
1. Context snapshot pattern instead of full re-reads
In my case, I usually rely on a context snapshot pattern rather than asking Claude to re-read the entire codebase repeatedly. Instead of requesting full file reads, I ask Claude to:
- Read status / summaries
- Work from previously established context snapshots
This has been very effective in reducing unnecessary token usage.
Additionally, I maintain Development Standards documents and reference them explicitly via claude.md. This allows Claude to anchor decisions to stable documentation instead of re-deriving rules every time.
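As a concrete illustration of that anchoring (file names here are hypothetical), a short CLAUDE.md entry pointing at stable standards docs might look like:

```markdown
# Project conventions
- Follow docs/dev-standards.md for naming, error handling, and test layout.
- docs/formatting.md is authoritative for code style; do not re-derive formatting rules.
- When in doubt, read the relevant standards doc before proposing changes.
```

The point is that the rules live in the referenced documents, not in the conversation, so Claude reads them on demand instead of reconstructing them each session.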
2. Sub-agents are a major hidden token sink
I’ve now fully switched to not using sub-agents, and I can confirm a noticeable improvement — thanks again for the tip.
From what I’ve observed, sub-agents are extremely expensive because:
- Each sub-agent appears to re-instantiate the full `claude.md` context
- This happens in addition to the global session context
This quickly multiplies token usage without being obvious to the user.
3. Project-scoped sessions may cause large upfront token spikes
One thing I’m still investigating:
I organize my work with separate sessions per project folder, and I usually copy all relevant documentation into each project directory.
I suspect this might be causing a large upfront token cost when starting a new Claude session — sometimes I see ~45k tokens consumed almost immediately.
If anyone has insights on:
- Sharing context across sessions more efficiently
- Or reducing initial context ingestion costs
I'd really appreciate hearing about it.
4. Manual model switching (Haiku / Sonnet / Opus)
I’ve also started actively switching models depending on the task:
- Haiku → very simple questions
- Sonnet → analysis, planning, documentation
- Opus → execution only
This helps, but it’s still manual, and I occasionally forget to switch models.
👉 Question to the community: Is there (or could there be) a way to automatically route requests to a model based on intent (analysis vs execution vs simple Q&A)?
5. Possible Sonnet behavior change
Lastly, this is subjective, but I’ve felt that Sonnet may have been nerfed recently, which indirectly pushes heavier usage toward Opus.
Hopefully there’s an upcoming update or clarification around this, because predictable cost/performance behavior is critical for real workflows.
Thanks again for sharing these findings. Hopefully this thread helps others reduce token burn while we wait for improvements.
@dannydanzka thank you!
I'm not using any agents, I always check context, and I always run clear before planning. Only a couple of MCPs are active, and I don't use them much: 3-4 times during a week.
Yesterday I asked it to create a plan based on some conditions; it completed everything the same day, and when I asked it to save the plan to a file it did so without my reaching any limit (session or weekly). The next day I just said yes, save this plan in the suggested .md file, and in seconds it ate 7% of the session. That's abnormal.
Seems it was fixed starting with 2.0.72; now it starts the session at 4% instead of 7%, so it's progress 👍
Isn't that still too high? Even when I tried with thinking mode disabled, the usage was high. But I'm getting much lower usage with auto-compact off. I think it's a bug in how auto-compact is currently set up to work.
When Claude starts eating a lot of tokens, I use a hybrid approach: ChatGPT 5.2 for planning (it's now good at coding as well, just a bit slow) and Claude for the coding itself (Claude coding is expensive), both on the $20 monthly plan. Thanks Claude, for triggering me to look for an alternative 😄
> Additionally, I maintain Development Standards documents and reference them explicitly via claude.md.
Note that the CLAUDE.md file is read into context all the time, so while it does help grounding, it is a constant cost in tokens in your context (visible in /context). There can be multiple CLAUDE.md files too, including your user's CLAUDE.md. Repetition across these files will waste tokens too.
Consider keeping a lean context and allowing for progressive enhancement via skills: these can be scoped to your repository or your local machine, and can be pulled in automatically by Claude, or, if that isn't working well enough, you can make Claude more aware of the skill using explicit callouts in CLAUDE.md.
(Note that the model you're using will significantly affect how well your instructions will be followed, so use direct and careful language in your CLAUDE prompts to get value from the tokens.)
Beyond their basic usage, skills can also use progressive disclosure to reduce upfront token cost.
For example, a skill can have a reference subfolder that includes additional instructions in separate markdown files. Your SKILL.md can then act as an index of sorts, giving instructions on when to follow a given reference. These are not @ mentions, which are pulled in straight away, but instead relative references in text, like `references/something-about-mary.md`.
These can be used by sub-agents as well.
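As a sketch of that index pattern (skill name and paths are made up), a SKILL.md might read:

```markdown
---
name: db-migrations
description: How to write and review database migrations in this repo
---
# Database migrations
For the migration naming scheme, follow references/naming.md.
For rollback requirements, read references/rollbacks.md before writing any down-migration.
```

Because `references/naming.md` and `references/rollbacks.md` are plain relative references rather than `@` mentions, they only enter the context when Claude decides it needs them.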
I have per-token billing with AWS Bedrock, so I often use sub-agents running on Haiku 4.5 for executing commands, as this significantly decreases the cost of reading and understanding command output. For this I've also paired a sub-agent with a script that redirects command output to a file for subsequent analysis with tools that read file segments.
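The redirect-then-read idea behind that script is roughly this (the noisy command here is just a stand-in):

```shell
# Capture a noisy command's full output (stdout + stderr) to a file...
out="$(mktemp)"
printf 'line %d\n' 1 2 3 4 5 > "$out" 2>&1

# ...then read only the segment you actually need instead of
# pulling the entire output into the model's context
sed -n '2,3p' "$out"   # prints: line 2, line 3

rm -f "$out"
```

Only the extracted segment ever needs to be read into context; the full log stays on disk for follow-up queries.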
Lastly, if you're interested in the base instructions used by Claude (that might contribute to your token usage alongside CLAUDE.md), consider looking at TweakCC as that may open the curtain a bit for you.
@jamestelfer thank you! This is really helpful!