
[BUG] /compact fails when context ~90% full due to max_tokens not being accounted for

Open · lurenss opened this issue 3 months ago

Fix: Account for max_tokens in /compact context window calculation

Problem Description

The /compact command fails with an API error when the context window is around 90% full, even though it reports having ~10% available space. The error message is:

Error during compaction: Error: API Error: 400 {"type":"error","error":{"type":"invalid_request_error","message":"input length and `max_tokens` exceed context limit: 198667 + 20000 > 200000, decrease input length or `max_tokens` and try again"},"request_id":"req_011CTTAxjzYcksbfD5ErTeTB"}

Root Cause Analysis

After analyzing the minified code in /Users/lurens/.claude/local/node_modules/@anthropic-ai/claude-code/cli.js, I identified the issue:

  1. The qb function (line 2211) returns the context window size: 200,000 for standard models
  2. The aj function calculates available context percentage but doesn't account for output tokens
  3. The compaction uses maxOutputTokensOverride: lx1 where lx1 = 20000 (line 2210)
  4. When checking if compaction is possible, the system only considers input tokens (~198,667) and reports ~10% available
  5. But the actual API call requires: input_tokens (198,667) + max_tokens (20,000) = 218,667 tokens total
  6. This exceeds the 200,000 limit, causing the API error (illustrated in the sketch below)
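
To make the mismatch concrete, here is a minimal sketch in plain JavaScript (descriptive constant names replace the minified identifiers; the numbers come from the error message above) of the check the meter effectively performs versus the constraint the API actually enforces:

const CONTEXT_LIMIT = 200000;  // qb() for standard models
const MAX_OUTPUT = 20000;      // lx1, passed as maxOutputTokensOverride during compaction
const inputTokens = 198667;    // conversation size taken from the error message

// Check as currently performed: only input tokens vs. the context limit
const meterSaysRoomLeft = inputTokens < CONTEXT_LIMIT;                  // true

// What the API actually enforces: input tokens plus reserved output tokens
const requestActuallyFits = inputTokens + MAX_OUTPUT <= CONTEXT_LIMIT;  // false: 218,667 > 200,000

console.log({ meterSaysRoomLeft, requestActuallyFits });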

Proposed Fix

The aj function needs to account for the output tokens reserved for the response. Here's the conceptual fix:

Current Logic (Problematic)

function aj(A) {
    let B = H9B() - kN6;  // Total available minus buffer
    let Q = pd() ? B : H9B();  // Use auto-compact threshold if enabled
    let Z = Math.max(0, Math.round((Q - A) / Q * 100));  // Percent left
    // ... rest of calculation
}

Fixed Logic

function aj(A) {
    let B = H9B() - kN6;  // Total available minus buffer
    let Q = pd() ? B : H9B();  // Use auto-compact threshold if enabled

    // Account for max_tokens needed for compaction output
    let effectiveAvailable = Q - lx1;  // Subtract 20000 tokens for output
    let Z = Math.max(0, Math.round((effectiveAvailable - A) / effectiveAvailable * 100));

    // Adjust thresholds to account for output space
    let G = effectiveAvailable - _N6;  // Warning threshold
    let Y = effectiveAvailable - xN6;  // Error threshold
    let W = A >= G;
    let I = A >= Y;
    let J = pd() && A >= (B - lx1);  // Auto-compact threshold with output buffer

    return {
        percentLeft: Z,
        isAboveWarningThreshold: W,
        isAboveErrorThreshold: I,
        isAboveAutoCompactThreshold: J
    };
}
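
As a quick sanity check of the percentage math (assuming H9B() resolves to the 200,000-token window, pd() returns false so Q = H9B(), and lx1 = 20,000), the failing case from the error message would now report 0% left instead of suggesting there is still room:

const Q = 200000;    // assumed H9B() value; the kN6 auto-compact buffer is ignored here
const lx1 = 20000;   // max_tokens reserved for the compaction response
const A = 198667;    // current input tokens, from the error message

const effectiveAvailable = Q - lx1;  // 180,000 tokens of genuinely usable input space
const percentLeft = Math.max(0, Math.round((effectiveAvailable - A) / effectiveAvailable * 100));

console.log(percentLeft);  // 0 -- the UI would warn well before the request can no longer fit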

Alternative Solutions

  1. Reduce max_tokens for compaction: Change lx1 from 20,000 to 10,000
  2. Dynamic max_tokens: Calculate max_tokens based on available space (see the sketch after this list)
  3. More conservative thresholds: Increase the buffer values (kN6, _N6, xN6)
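
A rough sketch of option 2 (the helper name and the floor value are hypothetical, not taken from cli.js): shrink max_tokens to whatever still fits in the window, and bail out when there is not enough headroom for a useful summary:

function compactionMaxTokens(inputTokens, contextLimit = 200000, preferred = 20000, floor = 1000) {
    const remaining = contextLimit - inputTokens;
    if (remaining < floor) return null;     // not enough headroom; caller must refuse or trim first
    return Math.min(preferred, remaining);  // use the preferred cap when it fits, otherwise shrink
}

console.log(compactionMaxTokens(150000));  // 20000 -- plenty of room, keep the default cap
console.log(compactionMaxTokens(198667));  // 1333  -- request still fits: 198,667 + 1,333 <= 200,000

In practice the floor would probably need to be higher for the summary to be useful, which is why combining this with the threshold fix above seems like the more robust route.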

Impact

This fix ensures that:

  • The /compact command works reliably when context is ~90% full
  • The percentage calculation accurately reflects usable space
  • Users get proper warnings before hitting the limit

Testing

To verify the fix:

  1. Fill the context to ~90% (so the meter shows ~10% available)
  2. Run the /compact command
  3. Compaction should complete successfully without API errors

Code References

  • lx1 = 20000 (line 2210) - max tokens for compaction
  • qb function (line 2211) - returns context window size
  • aj function - calculates available context percentage
  • H9B function - calculates total available tokens
  • px1 function - performs compaction with maxOutputTokensOverride: lx1

lurenss · Sep 24 '25

I think the bug where the reserved memory is double-counted is now fixed as of claude-sonnet-4-5-20250929 (Claude Code version 2.0.13). For details, see the closing comment in: https://github.com/anthropics/claude-code/issues/8914

halso · Oct 10 '25

Found something interesting: Claude Code auto-reads .md files from ~/.claude/ AFTER you compact - meaning your instructions can survive compaction.

Tested with verification codes: after compacting, Claude knew codes I had never mentioned in the session, which confirms the files were auto-loaded.

Impact: You can have persistent context that survives compaction, but only if you manage your file count strategically (I found a 5-file limit).

Full investigation: https://gist.github.com/Kevthetech143/f6962aa451253f23aba49175fc49f366

@anthropics Is this intentional? Worth documenting?

Kevthetech143 · Nov 02 '25

This issue has been inactive for 30 days. If the issue is still occurring, please comment to let us know. Otherwise, this issue will be automatically closed in 30 days for housekeeping purposes.

github-actions[bot] · Dec 10 '25