Feature Request: Configurable context limit and auto-compaction threshold

Open LeekJay opened this issue 2 days ago • 2 comments

Problem

Currently:

Auto-compaction only triggers when the context is nearly full (count > context - output)
Context limit is determined by the model and cannot be customized

For users who want to optimize costs, there's no way to:

Trigger compaction earlier
Limit the maximum context usage (e.g., use only 100k of a 200k model)

Proposed Solution

Add configurable options for context management:

{
  "compaction": {
    "auto": true,
    "prune": true,
    "threshold": 0.7,
    "maxContext": 100000
  }
}

threshold: A fraction (0.0 to 1.0) of the context limit at which auto-compaction should trigger. Default: 1.0 (current behavior)
maxContext: Maximum number of context tokens to use, capping the model's default limit (a value above the model's limit has no effect). Default: the model's context limit
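
To make the two options concrete, here is a minimal sketch of how they could be validated and defaulted. The names `CompactionConfig`, `validateCompaction`, and `resolveCompaction` are hypothetical and not part of the existing codebase. Rejecting a threshold of exactly 0 is an assumption on my part, since the proposal only gives the range 0.0 to 1.0.

```typescript
// Hypothetical shape of the proposed "compaction" config block.
interface CompactionConfig {
  auto?: boolean
  prune?: boolean
  threshold?: number // fraction of the context limit
  maxContext?: number // token cap, never above the model's own limit
}

// Returns a list of problems; an empty list means the config is acceptable.
// Assumption: threshold must be in (0, 1], maxContext a positive integer.
function validateCompaction(config: CompactionConfig): string[] {
  const problems: string[] = []
  const { threshold, maxContext } = config
  if (threshold !== undefined && !(threshold > 0 && threshold <= 1)) {
    problems.push("threshold must be greater than 0 and at most 1")
  }
  if (maxContext !== undefined && !(Number.isInteger(maxContext) && maxContext > 0)) {
    problems.push("maxContext must be a positive integer")
  }
  return problems
}

// Applies the defaults described above; modelContext is the model's own limit.
function resolveCompaction(config: CompactionConfig, modelContext: number) {
  return {
    auto: config.auto ?? true,
    threshold: config.threshold ?? 1.0,
    context: Math.min(modelContext, config.maxContext ?? modelContext),
  }
}
```

With these defaults, an empty config resolves to the model's full context and a threshold of 1.0, so existing setups would behave as they do today.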

Implementation Suggestion

In compaction.ts, modify the isOverflow function:

export async function isOverflow(input: { tokens: MessageV2.Assistant["tokens"]; model: Provider.Model }) {
  const config = await Config.get()
  if (config.compaction?.auto === false) return false
  
  // Let the user cap the context limit; the result never exceeds the model's own limit
  const modelContext = input.model.limit.context
  const maxContext = config.compaction?.maxContext ?? modelContext
  const context = Math.min(modelContext, maxContext)
  
  if (context === 0) return false
  const count = input.tokens.input + input.tokens.cache.read + input.tokens.output
  const output = Math.min(input.model.limit.output, SessionPrompt.OUTPUT_TOKEN_MAX) || SessionPrompt.OUTPUT_TOKEN_MAX
  
  // Apply threshold (default to 1.0 for backward compatibility)
  const threshold = config.compaction?.threshold ?? 1.0
  const usable = (context * threshold) - output
  
  return count > usable
}
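
As a worked example of the arithmetic above, the sketch below isolates the "usable" calculation into a pure function. `Limits`, `Settings`, and `usableTokens` are simplified, hypothetical stand-ins for the real types. The value of `OUTPUT_TOKEN_MAX` and the model limits are assumed for illustration only and may not match `SessionPrompt.OUTPUT_TOKEN_MAX` or any real model.

```typescript
// Simplified, hypothetical stand-ins for the real types in compaction.ts.
interface Limits { context: number; output: number }
interface Settings { threshold?: number; maxContext?: number }

// Assumed value for illustration; the real constant is SessionPrompt.OUTPUT_TOKEN_MAX.
const OUTPUT_TOKEN_MAX = 32_000

// Token count above which compaction would trigger under the proposal.
function usableTokens(limits: Limits, settings: Settings = {}): number {
  const context = Math.min(limits.context, settings.maxContext ?? limits.context)
  // Fall back to OUTPUT_TOKEN_MAX when the model reports an output limit of 0.
  const output = Math.min(limits.output, OUTPUT_TOKEN_MAX) || OUTPUT_TOKEN_MAX
  const threshold = settings.threshold ?? 1.0
  return context * threshold - output
}

// Hypothetical 200k-context model with a 32k output limit.
const model: Limits = { context: 200_000, output: 32_000 }

console.log(usableTokens(model)) // 168000 (current behavior)
console.log(usableTokens(model, { threshold: 0.7 })) // 108000
console.log(usableTokens(model, { maxContext: 100_000 })) // 68000
console.log(usableTokens(model, { threshold: 0.7, maxContext: 100_000 })) // 38000
```

Because the output reserve is subtracted after the threshold is applied, combining both options in this example triggers compaction at 38k tokens, well below 70% of the 100k cap. If `context * threshold` is not larger than the output reserve, the result is zero or negative and every check would report an overflow, so it may be worth rejecting or clamping such combinations.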

Use Cases

Cost optimization: Limit context to 100k even when using a 200k model
Earlier compaction: Trigger compaction at 60-80% context usage
Predictable behavior: Avoid hitting context limits unexpectedly
Budget control: Prevent unexpectedly large API bills from long sessions

Jan 13 '26 05:01 LeekJay