
feat: add a universal API for reasoning

Open geodic opened this issue 2 months ago • 5 comments

  • [x] I have looked for existing issues (including closed) about this

Feature Request

Allow reasoning to be turned on and configured through provider-agnostic AgentBuilder methods.

Motivation

Currently, enabling reasoning in rig is very provider-specific, with configuration usually done through the miscellaneous additional_params method. This makes writing multi-provider AI software more difficult, as the user of the library must configure each provider separately.

Proposal

This feature might be a little hard to implement as almost all providers have very different configuration methods for reasoning, with some providers not supporting it at all. However, I feel like most configuration should be abstractable into generic methods, reserving untranslatable provider-exclusive options for additional_params (e.g. Gemini's reasoning budget).
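To make the proposal concrete, here is a minimal sketch of what such an abstraction could look like. Every name in it (Reasoning, ProviderStyle, ProviderSetting, translate) is hypothetical and not part of rig's API. It only assumes the provider styles that come up in this issue: a plain on/off toggle, a named effort level, and no reasoning support at all.

```rust
/// Hypothetical provider-agnostic reasoning setting (not part of rig's API).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Reasoning {
    Off,
    Low,
    Medium,
    High,
}

/// How a provider exposes reasoning configuration (illustrative categories).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ProviderStyle {
    /// Only an on/off switch.
    Toggle,
    /// A named effort level such as "low", "medium" or "high".
    EffortLevel,
    /// No reasoning support at all.
    Unsupported,
}

/// What the generic setting becomes for one provider.
#[derive(Clone, Debug, PartialEq)]
pub enum ProviderSetting {
    Toggle(bool),
    Effort(&'static str),
    /// Nothing is sent for this provider.
    None,
}

/// Translates the generic setting into a provider-shaped one.
pub fn translate(reasoning: Reasoning, style: ProviderStyle) -> ProviderSetting {
    match (style, reasoning) {
        (ProviderStyle::Unsupported, _) => ProviderSetting::None,
        // A toggle can only express "off" versus "anything else".
        (ProviderStyle::Toggle, r) => ProviderSetting::Toggle(r != Reasoning::Off),
        (ProviderStyle::EffortLevel, Reasoning::Off) => ProviderSetting::None,
        (ProviderStyle::EffortLevel, Reasoning::Low) => ProviderSetting::Effort("low"),
        (ProviderStyle::EffortLevel, Reasoning::Medium) => ProviderSetting::Effort("medium"),
        (ProviderStyle::EffortLevel, Reasoning::High) => ProviderSetting::Effort("high"),
    }
}
```

Anything that does not fit this mapping, such as an exact token budget, would still go through additional_params.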

geodic avatar Oct 19 '25 18:10 geodic

RIG-1002

linear[bot] avatar Oct 19 '25 18:10 linear[bot]

I think this is worth adding, but provider implementations vary a lot. Ollama, for example, has a simple toggle to turn it on or off, while bigger providers like OpenAI have a full implementation for it, so we will definitely need to think about how to implement it.

joshua-mo-143 avatar Oct 19 '25 20:10 joshua-mo-143

configuration usually being done through the miscellaneous additional_params method

It's even worse on Ollama: think is in the main request, not in options, while additional_params gets merged into options. So you effectively can't disable thinking, not even with a Modelfile.
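A simplified model of the problem, for illustration only. ChatRequest and both merge functions are hypothetical stand-ins, not rig's actual types or the actual fix; values are kept as strings so the sketch needs nothing beyond the standard library.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for an Ollama chat request body (illustrative only).
#[derive(Debug, Default)]
pub struct ChatRequest {
    pub model: String,
    /// Top-level field; `None` means the field is omitted from the request.
    pub think: Option<bool>,
    /// Nested model options such as temperature.
    pub options: BTreeMap<String, String>,
}

/// Models the behaviour described above: extra params only ever reach `options`,
/// so a "think" key ends up in the wrong place.
pub fn merge_into_options(req: &mut ChatRequest, additional_params: &BTreeMap<String, String>) {
    for (key, value) in additional_params {
        req.options.insert(key.clone(), value.clone());
    }
}

/// One possible workaround: lift "think" to the top level before merging the rest.
pub fn merge_with_think(req: &mut ChatRequest, additional_params: &BTreeMap<String, String>) {
    for (key, value) in additional_params {
        if key == "think" {
            // An unparsable value leaves the field unset.
            req.think = value.parse::<bool>().ok();
        } else {
            req.options.insert(key.clone(), value.clone());
        }
    }
}
```

With the first function, passing think = false leaves the top-level field unset and only adds a think entry under options; the second function routes it to the top level.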

lnicola avatar Oct 23 '25 16:10 lnicola

configuration usually being done through the miscellaneous additional_params method

It's even worse on Ollama: think is in the main request, not in options, while additional_params gets merged into options. So you effectively can't disable thinking, not even with a Modelfile.

Yeah, to be honest, I am not really sure how to deal with this.

We could potentially add a generic (and do the config that way), but I would prefer the library not to become a giant mess of generics.

edit: Sorry, misread your message - yes, it seems like this is actually an issue. This should really be fixed by #496... but I'm still not sure why they made the choice to put think in the main request. I'll push an update to get this over the line.

However, as for how to fix the issue at large, we'll probably need to think about it.

joshua-mo-143 avatar Oct 23 '25 16:10 joshua-mo-143

This is part of a larger conversation about provider-specific options. For my part, here is what I did in our codebase. I used OpenRouter's budget calculation.

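use serde::Serialize;
use serde_json::json;
use strum::Display;

// The remaining names (`CompletionModelDyn`, `Result`, `RigError`) are assumed to be in scope.

/// Extension trait that maps a generic reasoning request onto provider-specific parameters.
pub trait CompletionModelDynExt: CompletionModelDyn {
    /// Returns the provider-specific reasoning parameters as JSON, if any.
    fn reasoning(
        &self,
        reasoning: AIReasoningRequest,
        max_tokens: Option<u64>,
    ) -> Result<Option<serde_json::Value>>;

    /// Whether the model supports reasoning; defaults to `true`.
    fn supports_reasoning(&self) -> bool {
        true
    }
}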


#[derive(Serialize)]
struct Thinking {
    pub r#type: &'static str,
    pub budget_tokens: u64,
}

impl CompletionModelDynExt for rig::providers::anthropic::completion::CompletionModel {
    fn reasoning(
        &self,
        reasoning: AIReasoningRequest,
        max_tokens: Option<u64>,
    ) -> Result<Option<serde_json::Value>> {
        let max_tokens = max_tokens
            .or(self.default_max_tokens)
            .ok_or(RigError::MaxTokensNotSet)?;

        let thinking = Thinking {
            r#type: "enabled",
            budget_tokens: reasoning.effort.budget_tokens(max_tokens),
        };

        Ok(Some(json!({
            "thinking": thinking,
        })))
    }
}

#[derive(Clone, Debug)]
pub struct AIReasoningRequest {
    pub effort: AIReasoningEffort,
}

#[derive(Clone, Debug, Display)]
#[strum(serialize_all = "snake_case")]
pub enum AIReasoningEffort {
    Low,
    Medium,
    High,
}

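impl AIReasoningEffort {
    /// Share of `max_tokens` reserved for reasoning: 20%, 50% or 80%.
    pub fn budget_tokens(&self, max_tokens: u64) -> u64 {
        match self {
            AIReasoningEffort::Low => max_tokens / 5,
            AIReasoningEffort::Medium => max_tokens / 2,
            // Multiply before dividing so integer division does not lose precision.
            AIReasoningEffort::High => max_tokens * 4 / 5,
        }
    }
}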
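The budget calculation above can also be sketched with bounds applied. This variant is an illustration, not part of the original code: Effort, MIN_BUDGET and bounded_budget are hypothetical names, and the 1024-token minimum and the "budget below max_tokens" rule are assumptions that should be checked against the provider's documentation.

```rust
/// Hypothetical effort levels mirroring the enum above.
#[derive(Clone, Copy, Debug)]
pub enum Effort {
    Low,
    Medium,
    High,
}

/// Assumed minimum thinking budget; verify against the provider's documentation.
pub const MIN_BUDGET: u64 = 1024;

/// Returns a budget of 20%, 50% or 80% of `max_tokens`, clamped to
/// `MIN_BUDGET..=max_tokens - 1`, or `None` if no valid budget fits.
pub fn bounded_budget(effort: Effort, max_tokens: u64) -> Option<u64> {
    // The budget must stay strictly below max_tokens to leave room for the answer.
    if max_tokens <= MIN_BUDGET {
        return None;
    }
    let (num, den): (u64, u64) = match effort {
        Effort::Low => (1, 5),
        Effort::Medium => (1, 2),
        Effort::High => (4, 5),
    };
    // Multiply first to keep precision; saturate to avoid overflow on huge inputs.
    let raw = max_tokens.saturating_mul(num) / den;
    Some(raw.clamp(MIN_BUDGET, max_tokens - 1))
}
```

Returning None lets the caller decide whether to skip reasoning or report an error when max_tokens is too small.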

Sytten avatar Nov 21 '25 18:11 Sytten