feat: add a universal API for reasoning
- [x] I have looked for existing issues (including closed) about this
Feature Request
Allow reasoning to be turned on and configured through provider-agnostic AgentBuilder methods.
Motivation
Currently, enabling reasoning in rig is very provider-specific, with configuration usually done through the catch-all additional_params method. This makes writing multi-provider AI software more difficult, as the user of the library must configure each provider separately.
Proposal
This feature might be a little hard to implement, as almost all providers have very different configuration methods for reasoning, and some providers don't support it at all. However, I feel like most of the configuration should be abstractable into generic methods, reserving untranslatable provider-exclusive options for additional_params (e.g. Gemini's reasoning budget).
I think this is worth adding, but provider implementations vary a lot. Ollama, for example, has a simple toggle to turn reasoning on or off, while bigger providers like OpenAI expose a full configuration surface for it, so we'll definitely need to think about how to implement this.
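As a rough sketch of what a provider-agnostic surface could look like (all names here are hypothetical, not rig's actual API): a single reasoning setting on the builder that each provider backend translates into its own request shape, falling back to additional_params for anything untranslatable.

```rust
// Hypothetical sketch, not rig's actual API: a provider-agnostic
// reasoning setting that each provider backend maps to its own knobs
// (Ollama's on/off toggle, Anthropic's budget_tokens, OpenAI's effort, ...).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ReasoningEffort {
    Low,
    Medium,
    High,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Reasoning {
    /// Provider/model default.
    Auto,
    /// Explicitly disabled (maps to e.g. Ollama's toggle off).
    Off,
    /// Enabled with a coarse effort level.
    Effort(ReasoningEffort),
}

// A builder could then expose one generic method instead of
// per-provider additional_params plumbing.
pub struct AgentBuilder {
    reasoning: Reasoning,
}

impl AgentBuilder {
    pub fn new() -> Self {
        Self { reasoning: Reasoning::Auto }
    }

    /// Provider-agnostic entry point; provider-exclusive extras
    /// would still go through additional_params.
    pub fn with_reasoning(mut self, r: Reasoning) -> Self {
        self.reasoning = r;
        self
    }
}

fn main() {
    let b = AgentBuilder::new().with_reasoning(Reasoning::Effort(ReasoningEffort::High));
    assert_eq!(b.reasoning, Reasoning::Effort(ReasoningEffort::High));
    println!("ok");
}
```

Providers that don't support reasoning at all could either reject `Reasoning::Effort(_)` with an error or silently ignore it; that trade-off is part of what would need deciding.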
configuration usually being done through the miscellaneous additional_params method
It's even worse on Ollama: think is in the main request, not in options, while additional_params gets merged into options. So you effectively can't disable thinking, not even with a Modelfile.
Yeah to be honest I am not really sure how to deal with this.
We could potentially add a generic type parameter (and do the config that way), but I would prefer that the library not become a giant mess of generics.
edit: Sorry, misread your message - yes, it seems like this is actually an issue. This should really be fixed by #496... but I'm still not sure why they made the choice to put think in the main request. I'll push an update to get this over the line.
However, as to how to fix the issue at large we'll probably need to think about it.
This is part of a larger conversation about provider-specific options. For my part, here is what I did in our codebase; I used the budget calculation from OpenRouter.
// Excerpt from our codebase; AIReasoningRequest, RigError, etc. are our own types.
pub trait CompletionModelDynExt: CompletionModelDyn {
    /// Translate a provider-agnostic reasoning request into the
    /// provider-specific JSON to merge into the completion request.
    fn reasoning(
        &self,
        reasoning: AIReasoningRequest,
        max_tokens: Option<u64>,
    ) -> Result<Option<serde_json::Value>>;

    fn supports_reasoning(&self) -> bool {
        true
    }
}

/// Anthropic's `thinking` request object.
#[derive(Serialize)]
struct Thinking {
    pub r#type: &'static str,
    pub budget_tokens: u64,
}

impl CompletionModelDynExt for rig::providers::anthropic::completion::CompletionModel {
    fn reasoning(
        &self,
        reasoning: AIReasoningRequest,
        max_tokens: Option<u64>,
    ) -> Result<Option<serde_json::Value>> {
        // Anthropic needs a token budget, so some max_tokens value must exist.
        let max_tokens = max_tokens
            .or(self.default_max_tokens)
            .ok_or(RigError::MaxTokensNotSet)?;
        let thinking = Thinking {
            r#type: "enabled",
            budget_tokens: reasoning.effort.budget_tokens(max_tokens),
        };
        Ok(Some(json!({
            "thinking": thinking,
        })))
    }
}

#[derive(Clone, Debug)]
pub struct AIReasoningRequest {
    pub effort: AIReasoningEffort,
}

#[derive(Clone, Debug, Display)]
#[strum(serialize_all = "snake_case")]
pub enum AIReasoningEffort {
    Low,
    Medium,
    High,
}

impl AIReasoningEffort {
    /// Map a coarse effort level to a thinking-token budget
    /// as a fraction of max_tokens (OpenRouter-style).
    pub fn budget_tokens(&self, max_tokens: u64) -> u64 {
        match self {
            AIReasoningEffort::Low => max_tokens / 5,       // 20%
            AIReasoningEffort::Medium => max_tokens / 2,    // 50%
            AIReasoningEffort::High => max_tokens / 10 * 8, // 80%
        }
    }
}
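For concreteness, the effort-to-budget mapping works out as follows. This is a standalone, runnable version of just the budget arithmetic (strum/serde derives dropped; 8000 max_tokens is an arbitrary example value):

```rust
// Standalone check of the OpenRouter-style effort-to-budget fractions.
#[derive(Clone, Copy, Debug)]
pub enum AIReasoningEffort {
    Low,
    Medium,
    High,
}

impl AIReasoningEffort {
    pub fn budget_tokens(&self, max_tokens: u64) -> u64 {
        match self {
            AIReasoningEffort::Low => max_tokens / 5,       // 20%
            AIReasoningEffort::Medium => max_tokens / 2,    // 50%
            AIReasoningEffort::High => max_tokens / 10 * 8, // 80%
        }
    }
}

fn main() {
    // With an 8000-token completion limit:
    assert_eq!(AIReasoningEffort::Low.budget_tokens(8000), 1600);
    assert_eq!(AIReasoningEffort::Medium.budget_tokens(8000), 4000);
    assert_eq!(AIReasoningEffort::High.budget_tokens(8000), 6400);
    println!("ok");
}
```

Note that `/ 10 * 8` uses integer division first, so High on a small max_tokens (under 10) rounds the budget down to 0; that edge case may deserve a floor.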