
Codex model response latency increases significantly as conversation grows

Skyline-23 opened this issue 9 hours ago · 1 comment

Problem

When using Codex models (via OAuth authentication), response latency increases significantly as the conversation grows longer. This happens because the entire conversation history is sent with every API request.

Root Cause

After investigating the codebase:

  1. The OpenAI Responses API returns a responseId in providerMetadata.openai.responseId
  2. The SDK supports previousResponseId option (openai-responses-language-model.ts:285)
  3. However, this feature is never actually used: the responseId returned with each response is not stored, and previousResponseId is never set (see the sketch below)
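
For reference, here is a minimal sketch (not opencode's actual code) of where the id described above surfaces when calling the Responses API through the AI SDK. The model name and prompt are illustrative only.

```typescript
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

async function main() {
  const result = await generateText({
    model: openai.responses("gpt-4o"), // illustrative model id
    prompt: "Write a hello-world in TypeScript.",
  });

  // The id the issue refers to; opencode currently discards it.
  const responseId = result.providerMetadata?.openai?.responseId;
  console.log(responseId); // e.g. "resp_..."
}

main();
```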

Current Flow

Request 1 → Send full history → Get response with responseId (discarded)
Request 2 → Send full history again → Get response with responseId (discarded)
Request N → Send increasingly large history → Slow response

Expected Flow with previous_response_id

Request 1 → Send full history → Get response with responseId (saved)
Request 2 → Send previousResponseId + new message only → Fast response
Request N → Reference previous response → Consistent speed
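
A follow-up turn could then look roughly like the sketch below, assuming the SDK accepts previousResponseId via providerOptions as noted above. savedResponseId stands in for an id stored from the prior turn; opencode does not persist one today.

```typescript
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

// Hypothetical follow-up request: instead of replaying the full history,
// reference the previous response and send only the new user message.
async function followUp(savedResponseId: string) {
  return generateText({
    model: openai.responses("gpt-4o"), // illustrative model id
    prompt: "Now make that function async.",
    providerOptions: {
      openai: { previousResponseId: savedResponseId },
    },
  });
}
```

The server-side conversation state keyed by the previous response id is what keeps the request payload roughly constant regardless of conversation length.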

Impact

  • Codex API calls become progressively slower as conversations grow
  • Poor user experience with long coding sessions
  • Unnecessary bandwidth and compute usage

Proposed Solution

  1. Add responseId field to AssistantMessage schema
  2. Store responseId from providerMetadata when receiving responses
  3. Pass previousResponseId to subsequent requests via providerOptions (sketched below)
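
A rough sketch of the three steps follows. Apart from AssistantMessage, the field and function names are hypothetical, not actual opencode identifiers.

```typescript
// 1. Extend the assistant message schema with an optional responseId.
interface AssistantMessage {
  role: "assistant";
  content: string;
  responseId?: string; // OpenAI Responses API id, e.g. "resp_..."
}

// 2. Store the id from providerMetadata when a response arrives.
function recordResponseId(
  message: AssistantMessage,
  providerMetadata?: { openai?: { responseId?: string } },
): void {
  message.responseId = providerMetadata?.openai?.responseId;
}

// 3. Thread it into the next request's providerOptions.
function buildProviderOptions(lastAssistant?: AssistantMessage) {
  return lastAssistant?.responseId
    ? { openai: { previousResponseId: lastAssistant.responseId } }
    : undefined;
}
```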
