opencode icon indicating copy to clipboard operation
opencode copied to clipboard

perf: optimize Anthropic prompt caching for reduced token costs

Open ormandj opened this issue 1 month ago • 0 comments

Summary

Optimizes Anthropic prompt caching to significantly reduce token costs for Claude models through two key improvements:

  • Tool caching: Adds cache breakpoint after tool definitions
  • System prompt reordering: Places stable, shared content first to maximize cache prefix hits across agents

Why

Anthropic's prompt caching uses prefix matching - the longer the matching prefix, the more tokens are served from cache at 90% discount. The previous implementation had two inefficiencies:

  1. Tools weren't cached: Tool definitions are stable but were re-sent uncached on every request
  2. Agent-specific content came first: This broke cache sharing when switching between agents (e.g., code → explore → code)

Test Results

A/B testing with identical prompts (same session, same agent) comparing legacy vs optimized behavior:

Metric Legacy Optimized Improvement
Cache writes (post-warmup) 18,417 tokens ~10,340 tokens 44% reduction
Effective cost (3rd prompt) 13,021 units 3,495 units 73% reduction
Initial cache write 16,211 tokens 17,987 tokens +11% (expected)
Cache hit rate 100% 100% Same

The slightly larger initial cache write (+11%) is quickly amortized by dramatically fewer cache invalidations in subsequent requests.

Changes

  • New ClaudeCache module with centralized cache control logic and configuration
  • Tool caching via ProviderTransform.applyToolCaching()
  • System prompt reordering: header → custom → provider → agent → environment (stable content first)
  • Configurable via provider.options.cache and agent.cache in opencode.json
  • Environment flags for testing: OPENCODE_LEGACY_CACHE=true reverts to old behavior, OPENCODE_DISABLE_CACHE=true disables all caching
  • 46 unit tests covering cache control application, TTL configuration, and provider compatibility

Configuration (optional)

{
  "provider": {
    "anthropic": {
      "options": {
        "cache": {
          "enabled": true,
          "toolsTtl": "5m",
          "instructionsTtl": "5m"
        }
      }
    }
  }
}

Agents can override provider settings via agent.cache.

ormandj avatar Dec 11 '25 23:12 ormandj