feat: expose rate limit information in APIError

Open JonHolman opened this issue 1 week ago • 1 comments

Rate Limit Information Enhancement

Summary

This PR adds detailed rate limit information to OpenCode's APIError schema, exposing provider-specific rate limiting details that were previously captured by the server but not accessible to SDK clients.

Problem

When OpenCode encounters rate limits from AI providers (Anthropic, Google, etc.), it captures detailed information in debug logs including:

Retry-after delays
Reset timestamps
Quota status and utilization
Provider-specific error details

However, this information was not exposed through the API, forcing applications to:

Use generic timeout handling
Implement blind retry strategies
Miss opportunities for intelligent rate limit management

Solution

Enhanced the MessageV2.APIError schema with a new rateLimitInfo field containing:

rateLimitInfo?: {
  retryAfter?: number          // milliseconds until retry allowed
  resetTime?: number           // Unix timestamp when limit resets  
  quotaStatus?: string         // e.g., "rejected", "allowed"
  quotaUtilization?: number    // fraction of quota used, nominally 0-1
  quotaDetails?: Record<string, any>  // provider-specific details
}

Implementation Details

1. Schema Enhancement (lines 34-41)

Added rateLimitInfo field to APIError Zod schema as an optional object with 5 optional sub-fields.

2. Helper Function (lines 611-805)

extractRateLimitInfo(error: APICallError) extracts rate limit details from:

Standard Headers:

retry-after: Supports both seconds (number) and HTTP-date formats
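As a rough illustration of handling both retry-after forms, the sketch below converts a header value into milliseconds. The name parseRetryAfterMs and its `now` parameter are hypothetical and are not taken from the PR; the actual extractRateLimitInfo code may differ.

```typescript
// Hypothetical helper (not from the PR): converts a Retry-After header value
// into a delay in milliseconds, or undefined when it cannot be parsed.
function parseRetryAfterMs(
  value: string | undefined,
  now: number = Date.now(),
): number | undefined {
  if (value === undefined) return undefined;
  const trimmed = value.trim();
  if (trimmed === "") return undefined;

  // delay-seconds form: a non-negative integer number of seconds.
  if (/^\d+$/.test(trimmed)) {
    return Number(trimmed) * 1000;
  }

  // HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return undefined;

  // A date in the past is clamped to zero: the caller may retry immediately.
  return Math.max(0, dateMs - now);
}
```

Passing `now` explicitly keeps the HTTP-date branch deterministic in tests.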

Anthropic-specific Headers:

anthropic-ratelimit-unified-reset: Reset timestamp
anthropic-ratelimit-unified-status: "rejected" or "allowed"
anthropic-ratelimit-unified-utilization: Quota utilization as a fraction (nominally 0-1)
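One way the three Anthropic headers could map onto the new fields is sketched below. It assumes the headers arrive as a plain object keyed by lowercased names; the AnthropicRateLimitFields type and the readAnthropicHeaders and parseFiniteNumber helpers are hypothetical names used only for illustration.

```typescript
// Hypothetical subset of rateLimitInfo that the Anthropic headers can fill.
interface AnthropicRateLimitFields {
  resetTime?: number;
  quotaStatus?: string;
  quotaUtilization?: number;
}

// Returns a finite number, or undefined for missing or malformed input.
function parseFiniteNumber(value: string | undefined): number | undefined {
  if (value === undefined || value.trim() === "") return undefined;
  const n = Number(value);
  return Number.isFinite(n) ? n : undefined;
}

// Assumption: header names are already lowercased by the HTTP layer.
function readAnthropicHeaders(
  headers: Record<string, string | undefined>,
): AnthropicRateLimitFields {
  const info: AnthropicRateLimitFields = {};

  const reset = parseFiniteNumber(headers["anthropic-ratelimit-unified-reset"]);
  if (reset !== undefined) info.resetTime = reset;

  const status = headers["anthropic-ratelimit-unified-status"]?.trim();
  if (status) info.quotaStatus = status;

  const utilization = parseFiniteNumber(
    headers["anthropic-ratelimit-unified-utilization"],
  );
  if (utilization !== undefined) info.quotaUtilization = utilization;

  return info;
}
```

Fields are only set when the corresponding header parses cleanly, so a missing or malformed header leaves the field absent rather than NaN.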

Google Response Body:

QuotaFailure: Violations with metrics, limits, and dimensions
RetryInfo: Retry delays in "Xs" format
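The Google case could be handled along the lines of the sketch below, which walks error.details, converts a RetryInfo delay in "Xs" format to milliseconds, and keeps QuotaFailure violations as provider-specific details. The names readGoogleBody and parseGoogleDurationMs, and the shape of the returned object, are assumptions made for illustration rather than the PR's actual code.

```typescript
// Hypothetical shape of one entry in error.details.
interface GoogleErrorDetail {
  "@type"?: string;
  [key: string]: unknown;
}

// Converts a duration such as "22.362s" to whole milliseconds.
function parseGoogleDurationMs(value: unknown): number | undefined {
  if (typeof value !== "string") return undefined;
  const match = /^(\d+(?:\.\d+)?)s$/.exec(value.trim());
  if (!match) return undefined;
  return Math.round(Number(match[1]) * 1000);
}

// Assumption: body is the already-parsed JSON response.
function readGoogleBody(body: unknown): {
  retryAfter?: number;
  quotaDetails?: Record<string, unknown>;
} {
  const details = (body as { error?: { details?: unknown } } | null)?.error
    ?.details;
  if (!Array.isArray(details)) return {};

  const result: { retryAfter?: number; quotaDetails?: Record<string, unknown> } = {};
  for (const detail of details as GoogleErrorDetail[]) {
    if (detail === null || typeof detail !== "object") continue;
    const type = detail["@type"];
    if (typeof type !== "string") continue;

    if (type.endsWith("google.rpc.RetryInfo")) {
      const ms = parseGoogleDurationMs(detail["retryDelay"]);
      if (ms !== undefined) result.retryAfter = ms;
    } else if (type.endsWith("google.rpc.QuotaFailure")) {
      result.quotaDetails = { violations: detail["violations"] };
    }
  }
  return result;
}
```

Rounding after the multiplication avoids floating-point residue in the millisecond value.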

3. Integration (line 876)

Updated fromError() to call extractRateLimitInfo(e) and include the result in the APIError it constructs.

Testing

Validated with actual rate limit responses:

Anthropic 429:

retry-after: 32692
anthropic-ratelimit-unified-reset: 1767884400
anthropic-ratelimit-unified-status: rejected  
anthropic-ratelimit-unified-utilization: 1.00003

Google RESOURCE_EXHAUSTED:

{
  "error": {
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.QuotaFailure",
        "violations": [{
          "quotaMetric": "character_count",
          "quotaLimit": "CharacterCountPerDay"
        }]
      },
      {
        "@type": "type.googleapis.com/google.rpc.RetryInfo",
        "retryDelay": "22.362s"
      }
    ]
  }
}

Benefits

Intelligent Retry Logic: Applications can wait exactly the required time instead of guessing
Quota Awareness: See when quotas reset and current utilization levels
Better Error Messages: Surface provider-specific details to users
Resource Efficiency: Avoid hammering rate-limited endpoints
Multi-Provider Support: Unified interface for different provider formats

Example Usage

const result = await opencode.sessions.execute(/* ... */);

if (result.error?.type === 'APIError' && result.error.rateLimitInfo) {
  const { retryAfter, resetTime, quotaStatus, quotaUtilization } = result.error.rateLimitInfo;
  
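  // Every sub-field is optional, so check each one before using it
  if (retryAfter !== undefined) console.log(`Rate limited. Retry in ${retryAfter}ms`);
  // Assumes resetTime is in Unix seconds, as in the Anthropic header; Date expects milliseconds
  if (resetTime !== undefined) console.log(`Quota resets at ${new Date(resetTime * 1000)}`);
  if (quotaStatus !== undefined) console.log(`Status: ${quotaStatus}`);
  if (quotaUtilization !== undefined) console.log(`Utilization: ${quotaUtilization * 100}%`);

  // Wait and retry (falls back to 1s if the provider gave no delay)
  await new Promise(resolve => setTimeout(resolve, retryAfter ?? 1000));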
  // ... retry logic
}
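Because every sub-field is optional, a client will likely want a single place that decides how long to wait. The sketch below is one possible approach, not part of the PR: pickRetryDelayMs is a hypothetical helper, the fallback and cap values are arbitrary, and it assumes resetTime is expressed in Unix seconds as in the Anthropic header.

```typescript
// Hypothetical helper (not from the PR): chooses a wait time in milliseconds
// from whatever rate limit hints are present.
function pickRetryDelayMs(
  info: { retryAfter?: number; resetTime?: number } | undefined,
  now: number = Date.now(),
  fallbackMs: number = 1000, // arbitrary default when no hint is available
  maxMs: number = 60_000, // arbitrary cap so callers never sleep for hours
): number {
  let delay: number | undefined;

  if (info?.retryAfter !== undefined) {
    // Prefer the explicit delay (already in milliseconds per the schema).
    delay = info.retryAfter;
  } else if (info?.resetTime !== undefined) {
    // Assumption: resetTime is Unix seconds, so convert before subtracting.
    delay = info.resetTime * 1000 - now;
  }

  if (delay === undefined || !Number.isFinite(delay)) delay = fallbackMs;

  // Clamp to [0, maxMs].
  return Math.min(maxMs, Math.max(0, delay));
}
```

The cap matters in practice: the Anthropic sample in the Testing section reports a retry-after of 32692 seconds, roughly nine hours, which an interactive client would probably surface to the user rather than sleep through.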

Backwards Compatibility

✅ Non-breaking: New field is optional
✅ Existing code: Continues to work unchanged
✅ Opt-in: Applications can choose to use the new field
✅ Type-safe: Full TypeScript support

Files Changed

packages/opencode/src/session/message-v2.ts

Jan 08 '26 06:01 JonHolman