feat(provider): add interleaved thinking support for models
## Summary

- Add an `interleaved_thinking` field to the ModelsDev Model schema to detect models with interleaved thinking capability
- Add an `interleavedThinking` capability to the provider Model interface for internal representation (both additions are sketched below)
- Update the transform logic to handle the new field mapping with proper default values
- Add comprehensive test coverage for the interleaved thinking transformation
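For orientation, here is a minimal sketch of the two additions, assuming a Zod-style schema like the one used for the models.dev data; the actual field sets and module names in the repo are larger and may differ:

```ts
import { z } from "zod"

// models.dev side: snake_case, optional so existing catalog entries keep validating.
// (Schema heavily simplified; only the new field and an id are shown.)
const ModelsDevModel = z.object({
  id: z.string(),
  interleaved_thinking: z.boolean().optional(),
})

// Provider side: camelCase capability on the internal Model shape.
interface ProviderModel {
  id: string
  interleavedThinking: boolean
}
```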
## What is Interleaved Thinking?

Interleaved thinking is a reasoning approach in which large language models alternate between thinking and action/answering steps, rather than following the traditional "think-then-answer" pattern. Instead of generating a long chain of thought followed by a single response, models using interleaved thinking follow a pattern like:

Reason → Tool Call → Observe → Reason → Tool Call → ...
**Key Benefits:**

- **Reduced Latency:** Cuts time-to-first-token (TTFT) by over 80% on average compared to traditional chain-of-thought reasoning
- **Dynamic Adaptation:** Allows models to adjust their strategy based on intermediate results and tool outputs
- **Error Reduction:** Enables immediate checking of reasoning steps, reducing error propagation in long chains
- **Enhanced Transparency:** Provides inspectable multi-step thinking through `reasoning_details` structures (see the sketch after this list)
- **Better Performance:** Shows up to a 19.3% improvement in accuracy on complex reasoning tasks
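To make the pattern concrete, here is a hedged sketch of what an interleaved exchange might look like on the wire for an OpenAI-compatible provider. The exact field shapes (especially `reasoning_details`) vary between providers, so treat this as illustrative rather than a fixed schema:

```ts
// One interleaved turn: reason → call a tool → observe the result → reason again → answer.
const exchange = [
  { role: "user", content: "How many open issues mention 'thinking'?" },
  {
    role: "assistant",
    content: "",
    // Intermediate reasoning travels alongside the tool call instead of one big upfront chain.
    reasoning_details: [{ type: "reasoning.text", text: "I need to search the issue tracker first." }],
    tool_calls: [
      { id: "call_1", type: "function", function: { name: "search_issues", arguments: '{"q":"thinking"}' } },
    ],
  },
  { role: "tool", tool_call_id: "call_1", content: "12 results" },
  {
    role: "assistant",
    content: "There are 12 open issues mentioning 'thinking'.",
    reasoning_details: [{ type: "reasoning.text", text: "The search returned 12 results, so I can answer directly." }],
  },
]
```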
## Research & Sources

This implementation is based on current research and industry developments:

- **Research Paper:** *Interleaved Reasoning for Large Language Models via Reinforcement Learning*, which reports the 80% TTFT reduction and 19.3% accuracy improvement cited above
- **Industry Documentation:** the Novita AI Interleaved Thinking Guide, which covers practical implementation
- **Real-world Adoption:** models such as MiniMax-M2 and Kimi-K2-Thinking already support this capability
## Technical Changes

- **ModelsDev Schema:** Added an optional `interleaved_thinking` boolean field to detect the model capability
- **Provider Interface:** Added an optional `interleavedThinking` boolean to the Model capabilities
- **Transform Logic:** Updated the transformation functions to map between the two schemas with proper defaults (sketched after this list)
- **Backward Compatibility:** Made the field optional so existing models continue to work
- **Test Coverage:** Added tests to verify proper transformation and default handling
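A sketch of the transform step, assuming a helper roughly shaped like the real one (the actual function in the provider module covers many more capabilities); the point is the snake_case to camelCase mapping and the `false` default:

```ts
// Map the models.dev record onto the internal model capabilities.
// Defaulting to false keeps models that never declare the field behaving exactly as before.
function fromModelsDev(model: { interleaved_thinking?: boolean }): { interleavedThinking: boolean } {
  return {
    interleavedThinking: model.interleaved_thinking ?? false,
  }
}
```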
## Applications

This capability turns traditional function calling into agent-level tool use, making it particularly valuable for:
- Complex multi-hop question answering
- Mathematical reasoning
- Logical deduction
- Tool-assisted problem solving
## Testing
All existing tests pass, and new test coverage has been added for the interleaved thinking transformation logic. The changes maintain full backward compatibility with existing model configurations.
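As an illustration, the new coverage boils down to checks along these lines, written here with `bun:test` against the `fromModelsDev` sketch above (the actual test files and helper names in the repo may differ):

```ts
import { describe, expect, test } from "bun:test"
// `fromModelsDev` refers to the transform sketch shown earlier.

describe("interleaved thinking transform", () => {
  test("maps interleaved_thinking to interleavedThinking", () => {
    expect(fromModelsDev({ interleaved_thinking: true }).interleavedThinking).toBe(true)
  })

  test("defaults to false when the field is missing", () => {
    expect(fromModelsDev({}).interleavedThinking).toBe(false)
  })
})
```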
I think the correct fix is adding `reasoning_details` support to the OpenAI-compatible provider. We should track the interleaved thinking boolean per model tho, but that should first be done on models.dev.
I am going to add interleaved thinking support to our custom AI SDK provider.
I tried using the `reasoning_details` parameter, but it didn't work for many providers; for example, LiteLLM doesn't work, nor does VertexAI (for the Kimi and MiniMax APIs). Instead, I tried passing the reasoning via content, and GPT OSS magically became more competent; it was like night and day for simple local tasks. MiniMax and Kimi showed the same result: before, their reasoning constantly started with "The user asked me...", whereas now, for subsequent messages, they actually respond to the tool output.
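Roughly what I mean by "passing the reasoning via content" (a sketch only; the `<think>` wrapper and the helper name are my own, not anything from opencode):

```ts
// Fold the previous turn's reasoning into the assistant message content, so any
// OpenAI-compatible endpoint (LiteLLM, llama.cpp, ...) sees it again on the next request.
function foldReasoningIntoContent(reasoning: string | undefined, content: string): string {
  if (!reasoning) return content
  // Wrap it so the model can distinguish its old thinking from the visible answer.
  return `<think>\n${reasoning}\n</think>\n${content}`
}

const assistantTurn = {
  role: "assistant" as const,
  content: foldReasoningIntoContent(
    "I should run the ls tool first, then read the README.",
    "Running the ls tool.",
  ),
}
```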
Ah okay, that's a good point. Hm okay, I'll do some more research and we will talk internally about this problem in a few hrs. I do see why this fix works; it does feel a bit like a hack, but I'm very thankful for you bringing this to my attention. I'll keep u posted.
> Instead, I tried passing the reasoning via content, and GPT OSS magically became more competent
How did you do this? I am seeing exactly the same thing you are reporting: e.g., each reasoning message starts with "The user asks me..." instead of the model continuing where it left off.
Hi @rekram1-node
I’ve seen the PR about "better interleaved thinking" (#5298), but I can confirm that it still doesn't work with LiteLLM Proxy.
Since I use many models from different providers, the only practical way for me to manage the situation and track costs is the LiteLLM proxy. The problem is not limited to LiteLLM, though: even when querying llama.cpp directly, the reasoning is not passed back to the model.
In practice, it seems that to ensure greater compatibility it would be better to include the `content` field in addition to the `reasoning_content` and `reasoning_details` fields.
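Just to illustrate what I mean, the assistant turn echoed back to the provider could carry the reasoning in all three places at once; some backends only read `content`, others only `reasoning_content` or `reasoning_details` (the inner shape of `reasoning_details` below is only an example; providers differ):

```ts
const assistantTurn = {
  role: "assistant" as const,
  // Reasoning embedded in the visible content, for backends that ignore the dedicated fields.
  content: "<think>\nCheck the tool result before answering.\n</think>\nHere is the summary...",
  // Dedicated fields, for backends that follow the OpenAI-style reasoning extensions.
  reasoning_content: "Check the tool result before answering.",
  reasoning_details: [{ type: "reasoning.text", text: "Check the tool result before answering." }],
}
```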
Let me know what you think.
there were 2 different interleaved thinking PRs, what format does LiteLLM expect?
can u not define this in your opencode.json? we can add more mapping options, but if all your models are being defined by u, you should be able to specify which data to send back
It seems that nowadays there is no standard for interleaved thinking support that all providers adhere to, so everyone implements whatever version they like, and some don't implement it at all.
That is why, in my opinion, it would be truly useful if OpenCode (and generally any LLM client) offered a certain degree of provider customization.
So, in the specific case of models behind LiteLLM, it seems you have to pass it back via `content`, but for others that follow the OpenAI schema you need to use specific fields like `reasoning_content` and `reasoning_details`.
I saw that PR #5207 was merged; could this be useful in any way for creating provider-specific plugins without messing up the configuration?
We can add/expand the interleaved thinking configuration support, but I don't think we should be converting all reasoning chunks to text parts. If there is a specific provider that requires it then maybe, but so far all the providers that would want it that way (that I've seen) already send the reasoning chunks back as assistant messages with the
@rekram1-node
I can confirm that the interleaved thinking support with the parameter you implemented works by specifying the `reasoning_details` or `reasoning_content` field. My issue was that I was simply passing `interleaved: true` and it didn't always work with all models; specifying the field instead works even with LiteLLM and other providers. For me, this PR can be closed, since a clearly better version has been implemented and mine was just a workaround. Perhaps simply updating the documentation about it could be marginally useful.
Sweet