# Feature/OpenAI compatible reasoning

Add support for reasoning models that return a `reasoning_content` field (Kimi K2, Qwen3, DeepSeek-V3.2) hosted with OpenAI-compatible APIs.
## Overview

This PR adds support for OpenAI-compatible reasoning models that return their thinking process via the `reasoning_content` field, including DeepSeek, Qwen Thinking, and Kimi K2 Thinking models. Please note, I need this feature for myself; that is primarily why I wrote it, but I'm pushing it upstream in the hope it will be useful.
## Problem

Several popular reasoning models (DeepSeek-V3, DeepSeek-R1, Qwen3-235B-Thinking, Kimi-K2-Thinking) are served with OpenAI-compatible APIs by some providers, but return their reasoning/thinking content in a separate `reasoning_content` field rather than using OpenAI's native reasoning API format.

Currently, OpenCode either:

- Loses this reasoning content entirely
- Mixes it into the regular response text
- Fails to display it in the proper collapsible UI format
## Impact

Especially when the thinking is not displayed, this breaks the flow of work: the user sits and waits a long time for the final response with nothing to show for it. I found this inconvenient enough to be unusable with my favourite models.
## Solution

This PR introduces a custom AI SDK wrapper (`@ai-sdk/openai-compatible-reasoning`) that:

- Intercepts streaming chunks from OpenAI-compatible APIs
- Detects `delta.reasoning_content` fields in the response
- Transforms them into proper reasoning events (`reasoning-start`, `reasoning-delta`, `reasoning-end`)
- Enables OpenCode to display reasoning in collapsible UI blocks, just like Claude Extended Thinking or OpenAI o1 models
## Implementation Details

### New Provider Type

Created a new bundled provider `@ai-sdk/openai-compatible-reasoning` that extends the standard OpenAI-compatible provider with reasoning detection capabilities.

**File:** `packages/opencode/src/provider/sdk/openai-compatible/src/openai-compatible-chat-reasoning-model.ts`

The chat model wrapper (sketched below):

- Extends `OpenAICompatibleChatLanguageModel` from `@ai-sdk/openai-compatible`
- Adds a `TransformStream` to intercept raw chunks
- Parses `choices[0].delta.reasoning_content` from streaming responses
- Emits synthetic reasoning events that OpenCode's processor already handles
- Maintains state to properly emit start/delta/end events
- Fully delegates request handling to the base model (preserves multimodal support)
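To make the mechanism concrete, here is a minimal sketch of the idea, not the PR's actual code: it assumes AI SDK v5 stream part types, that raw chunks expose the parsed SSE payload via `rawValue` (which requires `includeRawChunks`), and it simplifies the state handling to a single boolean.

```typescript
import { OpenAICompatibleChatLanguageModel } from "@ai-sdk/openai-compatible"
import type { LanguageModelV2StreamPart } from "@ai-sdk/provider"

// Minimal sketch; the real wrapper lives in
// openai-compatible-chat-reasoning-model.ts and may differ in detail.
export class ReasoningChatModel extends OpenAICompatibleChatLanguageModel {
  async doStream(
    options: Parameters<OpenAICompatibleChatLanguageModel["doStream"]>[0],
  ) {
    // Force raw chunks so the TransformStream can see reasoning_content.
    const result = await super.doStream({ ...options, includeRawChunks: true })

    let inReasoning = false // state for start/delta/end bookkeeping
    const id = "reasoning-0"

    const transform = new TransformStream<
      LanguageModelV2StreamPart,
      LanguageModelV2StreamPart
    >({
      transform(part, controller) {
        const delta =
          part.type === "raw"
            ? (part.rawValue as any)?.choices?.[0]?.delta
            : undefined
        const reasoning = delta?.reasoning_content
        if (typeof reasoning === "string" && reasoning.length > 0) {
          // Open a reasoning block on the first reasoning chunk...
          if (!inReasoning) {
            controller.enqueue({ type: "reasoning-start", id })
            inReasoning = true
          }
          // ...and stream each piece as a reasoning-delta.
          controller.enqueue({ type: "reasoning-delta", id, delta: reasoning })
        } else if (inReasoning && part.type !== "raw") {
          // The first non-reasoning part closes the block.
          controller.enqueue({ type: "reasoning-end", id })
          inReasoning = false
        }
        controller.enqueue(part) // everything else passes through unchanged
      },
    })

    return { ...result, stream: result.stream.pipeThrough(transform) }
  }
}
```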
**File:** `packages/opencode/src/provider/sdk/openai-compatible/src/openai-compatible-provider.ts`

The provider factory (sketched below):

- **Chat models**: uses the custom `OpenAICompatibleChatWithReasoningLanguageModel` wrapper
- **Embeddings**: uses the official `OpenAICompatibleEmbeddingModel` directly
- **Image generation**: uses the official `OpenAICompatibleImageModel` directly
- Maintains full feature parity with the standard provider
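A hedged sketch of how such a factory might delegate; the config shape follows `@ai-sdk/openai-compatible` conventions, and all names here are illustrative rather than copied from the PR:

```typescript
import {
  OpenAICompatibleEmbeddingModel,
  OpenAICompatibleImageModel,
} from "@ai-sdk/openai-compatible"

// ReasoningChatModel is the wrapper sketched in the previous section.
export function createReasoningProvider(settings: {
  baseURL: string
  apiKey?: string
}) {
  const config = {
    provider: "openai-compatible-reasoning",
    url: ({ path }: { path: string }) => `${settings.baseURL}${path}`,
    headers: () =>
      settings.apiKey ? { Authorization: `Bearer ${settings.apiKey}` } : {},
  }
  return {
    // Chat goes through the reasoning-aware wrapper...
    languageModel: (modelId: string) => new ReasoningChatModel(modelId, config),
    // ...while embeddings and image generation delegate to the official classes.
    textEmbeddingModel: (modelId: string) =>
      new OpenAICompatibleEmbeddingModel(modelId, config),
    imageModel: (modelId: string) =>
      new OpenAICompatibleImageModel(modelId, config),
  }
}
```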
### Provider Options Support

Added support for a `reasoning.enabled` provider option to enable reasoning output for models that require explicit request parameters (like DeepSeek).

**File:** `packages/opencode/src/provider/transform.ts`

When configured, the option is passed through to the API request:

```json
{
  "model": "deepseek-ai/DeepSeek-V3.2",
  "messages": [...],
  "reasoning": {
    "enabled": true
  }
}
```
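For illustration, a hedged sketch of what that pass-through might look like; the function name and types are hypothetical, not the actual transform.ts code:

```typescript
// Hypothetical helper: copy the user-configured reasoning option into
// the outgoing OpenAI-compatible request body.
type ReasoningOptions = { reasoning?: { enabled?: boolean } }

export function applyReasoningOption(
  body: Record<string, unknown>,
  options: ReasoningOptions,
): Record<string, unknown> {
  if (options.reasoning?.enabled) {
    body["reasoning"] = { enabled: true }
  }
  return body
}
```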
## Architecture

```text
Raw API Response (SSE chunks)
        ↓
OpenAICompatibleChatWithReasoningLanguageModel.doStream()
        ↓
TransformStream intercepts "raw" chunks
        ↓
Parses delta.reasoning_content field
        ↓
Emits: reasoning-start → reasoning-delta(s) → reasoning-end
        ↓
SessionProcessor receives events (existing code)
        ↓
Creates MessageV2.ReasoningPart (existing code)
        ↓
UI displays in collapsible thinking blocks (existing code)
```
## Configuration

Users can now configure reasoning models using the new provider.

### Example: DeepSeek V3.2 via DeepInfra
```json
{
  "provider": {
    "deepinfra-thinking": {
      "npm": "@ai-sdk/openai-compatible-reasoning",
      "options": {
        "baseURL": "https://api.deepinfra.com/v1/openai",
        "reasoning": {
          "enabled": true
        }
      },
      "models": {
        "deepseek-ai/DeepSeek-V3.2": {
          "name": "DeepSeek V3.2"
        },
        "Qwen/Qwen3-235B-A22B-Thinking-2507": {
          "name": "Qwen3 235B Thinking"
        }
      }
    }
  }
}
```
### Example: Direct DeepSeek API
```json
{
  "provider": {
    "deepseek": {
      "npm": "@ai-sdk/openai-compatible-reasoning",
      "options": {
        "baseURL": "https://api.deepseek.com/v1",
        "reasoning": {
          "enabled": true
        }
      },
      "models": {
        "deepseek-chat": {
          "name": "DeepSeek Chat"
        },
        "deepseek-reasoner": {
          "name": "DeepSeek Reasoner"
        }
      }
    }
  }
}
```
## Supported Models

This implementation works with any OpenAI-compatible model that returns `reasoning_content` in the response, including:

- **DeepSeek**: DeepSeek-V3, DeepSeek-R1, deepseek-chat, deepseek-reasoner
- **Qwen**: Qwen3-235B-A22B-Thinking-2507 and other Qwen thinking models
- **Kimi**: Kimi-K2-Thinking (moonshotai/Kimi-K2-Thinking)
## Feature Completeness

The custom provider maintains full feature parity with the official `@ai-sdk/openai-compatible` provider:

- **Chat models**: custom wrapper with reasoning support plus full multimodal input (images, files)
- **Embedding models**: delegated to the official `OpenAICompatibleEmbeddingModel`
- **Image generation**: delegated to the official `OpenAICompatibleImageModel`
- **Multimodal chat**: images can be dragged into the terminal and sent in messages (unchanged behavior)
## How It Works

The reasoning wrapper is a thin response-only layer.

**Requests (unchanged):**

- Multimodal messages (text + images) pass through directly to the base model
- The base model handles image encoding, URL resolution, and API formatting
- All request functionality works identically to the official provider

**Responses (enhanced):**

- Intercepts streaming chunks to detect `reasoning_content` fields
- Transforms reasoning into proper UI events
- Passes through all other content unchanged
This means:

- Users can send images in chat messages (drag & drop in terminal)
- Models can use text embeddings via `textEmbeddingModel()` (see the usage sketch below)
- Models can generate images via `imageModel()`
- Reasoning is properly displayed in collapsible UI blocks
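As a hypothetical usage example, building on the factory sketch above (the embedding model ID is an illustrative assumption, not from the PR):

```typescript
import { embed, streamText } from "ai"

// Hypothetical usage of the createReasoningProvider sketched earlier.
const provider = createReasoningProvider({
  baseURL: "https://api.deepinfra.com/v1/openai",
  apiKey: process.env.DEEPINFRA_API_KEY,
})

// Chat: reasoning_content is surfaced as reasoning-* stream parts,
// which OpenCode renders as collapsible thinking blocks.
const result = streamText({
  model: provider.languageModel("deepseek-ai/DeepSeek-V3.2"),
  prompt: "Why is the sky blue?",
})
for await (const text of result.textStream) process.stdout.write(text)

// Embeddings: delegated untouched to the official embedding model.
const { embedding } = await embed({
  model: provider.textEmbeddingModel("BAAI/bge-large-en-v1.5"),
  value: "hello world",
})
```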
## Testing

Tested with:

- DeepSeek-V3 via DeepInfra (requires `reasoning.enabled: true`)
- Qwen3-235B-A22B-Thinking-2507 via DeepInfra (works without explicit enablement)
## Notes

- The `reasoning.enabled` option is only required for some models (DeepSeek). Other models (Qwen) return reasoning by default.
- The implementation uses OpenCode's existing reasoning display logic.
- **Full feature parity**: the provider supports all capabilities of the official OpenAI-compatible provider (chat, embeddings, image generation, multimodal input).
- **Non-breaking**: only the response stream is intercepted; all request handling is identical to the official provider.
- Documentation is updated as well.
---

this doesn't do what you think it does

that code explicitly says it is only for GitHub Copilot, and GitHub Copilot doesn't return reasoning under that field anyway

what provider(s) aren't getting reasoning sent back? the only place changes are necessary is transform.ts
---

> this doesn't do what you think it does
>
> that code explicitly says it is only for GitHub Copilot, and GitHub Copilot doesn't return reasoning under that field anyway

I think you should've written "did", because what it does now is support every other provider that delivers models this way :-)

> what provider(s) aren't getting reasoning sent back?

DeepInfra, for example; that is what I use.

> the only place changes are necessary is transform.ts

transform.ts handles what we send TO the provider. We need to handle what comes back. The extra parameter was just that, an extra. Adding the handling in this provider was the cleanest, least intrusive way to implement it.
---

Okay, fair; I read this on my phone, so apologies. So many people vibe code this stuff without understanding what's going on, and it frustrates me; you seem to understand it better, so that's my bad for misreading!!

Idk if this is the best approach for us tho; we don't want to have to maintain these hacks on our end if we can help it. The ai-sdk deepseek provider should be handling it for us.

> Also this should ONLY be used for Copilot provider.

This note is why I had my initial reaction. We aren't going to want to edit it unless there is no other way; I'd like to explore other options before making changes to that code.

Are there other providers you noticed having issues? I think the cleanest fix is to go upstream and PR to ai-sdk/deepinfra, and then we can update models.dev to track these models.
---

> > Also this should ONLY be used for Copilot provider.
>
> This note is why I had my initial reaction. We aren't going to want to edit it unless there is no other way; I'd like to explore other options before making changes to that code.

Fair enough. Well, I needed it for myself. I wanted to make the smallest, least disruptive change I could (and if you never integrate it, this will not be difficult to manage for myself).

> Idk if this is the best approach for us tho; we don't want to have to maintain these hacks on our end if we can help it. The ai-sdk deepseek provider should be handling it for us.

It is not just DeepSeek models. DeepInfra provides hundreds if not thousands of models. And I just found out that SiliconFlow uses the same mechanism, as does vLLM, a local server you can host (though I haven't tested these two). So it opens up some very capable models.

Then you have OpenRouter, which uses a very similar way to send reasoning, so it could be made to work by just renaming the field from reasoning_content to reasoning (and removing the empty check on the content field; OpenRouter sends content as an empty string, which currently would not work).
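To illustrate, a hypothetical tweak to the detection logic (not code from this PR):

```typescript
// Hypothetical: coalesce both spellings so OpenRouter's `reasoning` is
// detected alongside the `reasoning_content` used by DeepInfra,
// SiliconFlow, and vLLM; an empty `content` field must not short-circuit
// the check, since OpenRouter streams content as "" while reasoning.
type ChunkDelta = {
  content?: string | null
  reasoning_content?: string
  reasoning?: string
}

function extractReasoning(delta: ChunkDelta): string | undefined {
  return delta.reasoning_content || delta.reasoning || undefined
}
```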
> Idk if this is the best approach for us tho; we don't want to have to maintain these hacks on our end if we can help it. The ai-sdk deepseek provider should be handling it for us.

Well, that is for you to decide, but just saying: the OpenAI API has a life of its own outside OpenAI. We have at least three providers now that use it (DeepInfra, SiliconFlow, OpenRouter, plus quite a few local hosting gateways), each in a slightly different way. Perhaps having a highly customizable openai-compatible provider is not such a bad idea. Something to discuss, perhaps.

Anyway, for me this is huge. The models it opens up are almost as capable as the alternatives from Anthropic (and now also Google; I haven't tried ChatGPT 5.2 yet, so I can't speak for that) and in general are a tenth of the price. So this one thing puts OpenCode on equal footing with Gemini CLI and Claude Code in my use case.

However, I would also love a feature to hide reasoning data by default and open it with a keystroke. That is how it works in Claude Code and I really like it. But that is a conversation for another day.
> Are there other providers you noticed having issues? I think the cleanest fix is to go upstream and PR to ai-sdk/deepinfra, and then we can update models.dev to track these models.

I do not use the DeepInfra SDK; I use their OpenAI-compatible endpoint. I prefer to be vendor agnostic: this way I can swap to OpenRouter with minimum fuss, or to my own local models.

See my other reply: I believe SiliconFlow also uses the same format, and vLLM too. OpenRouter is not the same, but similar enough to make work easily if we want to.