opencode
Codex model response latency increases significantly as conversation grows
Problem
When using Codex models (via OAuth authentication), response latency increases significantly as the conversation grows longer. This happens because the entire conversation history is sent with every API request.
Root Cause
After investigating the codebase:
- The OpenAI Responses API returns a `responseId` in `providerMetadata.openai.responseId`
- The SDK supports a `previousResponseId` option (openai-responses-language-model.ts:285)
- However, this feature is never actually used: the `responseId` from responses is not stored, and `previousResponseId` is never set (see the sketch below)
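For context, a minimal sketch of how these pieces fit together in the AI SDK (option and metadata names as exposed by `@ai-sdk/openai` for Responses models; the model name and prompts are illustrative):

```ts
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

// Turn 1: the Responses API returns a responseId in provider metadata.
const first = await generateText({
  model: openai.responses("gpt-4o"),
  prompt: "Refactor the auth module",
});
const responseId = first.providerMetadata?.openai?.responseId as
  | string
  | undefined;

// Turn 2: pass it back so the server can resolve the prior context
// instead of receiving the full conversation history again.
const second = await generateText({
  model: openai.responses("gpt-4o"),
  prompt: "Now add unit tests",
  providerOptions: {
    openai: { previousResponseId: responseId },
  },
});
```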
Current Flow
Request 1 → Send full history → Get response with responseId (discarded)
Request 2 → Send full history again → Get response with responseId (discarded)
Request N → Send increasingly large history → Slow response
Expected Flow with previous_response_id
Request 1 → Send full history → Get response with responseId (saved)
Request 2 → Send previousResponseId + new message only → Fast response
Request N → Reference previous response → Consistent speed
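At the API level, this expected flow corresponds to sending `previous_response_id` on follow-up calls, roughly as below (sketch using the official `openai` Node client; model name and prompts are placeholders):

```ts
import OpenAI from "openai";

const client = new OpenAI();

// Request 1: full prompt; the server returns a response id.
const r1 = await client.responses.create({
  model: "gpt-4o",
  input: "Start a coding session: scaffold the project",
});

// Request 2: only the new message, plus a reference to the prior response.
const r2 = await client.responses.create({
  model: "gpt-4o",
  previous_response_id: r1.id,
  input: "Now add a README",
});
```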
Impact
- Codex API calls become progressively slower as conversations grow
- Poor user experience with long coding sessions
- Unnecessary bandwidth and compute usage
Proposed Solution
- Add a `responseId` field to the `AssistantMessage` schema
- Store the `responseId` from `providerMetadata` when receiving responses
- Pass `previousResponseId` to subsequent requests via `providerOptions` (a rough sketch follows)
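A rough sketch of what these changes could look like (the `AssistantMessage` name comes from this issue; the Zod shape and helper names below are assumptions for illustration, not opencode's actual internals):

```ts
import { z } from "zod";

// 1. Add a responseId field to the AssistantMessage schema.
const AssistantMessage = z.object({
  role: z.literal("assistant"),
  content: z.string(),
  // OpenAI Responses API id, kept so follow-up requests can reference it.
  responseId: z.string().optional(),
});
type AssistantMessage = z.infer<typeof AssistantMessage>;

// 2. Store the responseId from providerMetadata when a response arrives.
function toAssistantMessage(
  text: string,
  providerMetadata?: Record<string, Record<string, unknown>>,
): AssistantMessage {
  return {
    role: "assistant",
    content: text,
    responseId: providerMetadata?.openai?.responseId as string | undefined,
  };
}

// 3. Pass previousResponseId on the next request via providerOptions.
function buildProviderOptions(last?: AssistantMessage) {
  return last?.responseId
    ? { openai: { previousResponseId: last.responseId } }
    : undefined;
}
```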