
Add thinking_mode, thinking_budget to ChatGoogleAI

Open nallwhy opened this issue 10 months ago • 3 comments

Google’s API behavior is a bit inconsistent.

Except for gemini-2.5-pro and gemini-2.5-flash, providing a thinkingConfig results in an error. Even with gemini-2.5-pro, the thinkingBudget inside thinkingConfig is ignored. For gemini-2.5-flash, it seems to automatically apply a default thinkingBudget of 1024 when no value is provided.

Because of this inconsistency, it’s difficult to handle thinkingConfig properly without knowing the exact model. In this PR, thinkingConfig is only included when thinking_mode: true to avoid triggering errors on models that don’t support it.
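A minimal sketch of what I mean by "only included when thinking_mode: true" (the module and function names here are hypothetical, not the actual PR code; the option names `thinking_mode`/`thinking_budget` follow this PR's proposal):

```elixir
defmodule ThinkingConfig do
  # Only attach thinkingConfig to the request body when the caller
  # explicitly opts in, so models that reject the field don't error.
  def maybe_add(body, %{thinking_mode: true, thinking_budget: budget})
      when is_integer(budget) do
    Map.put(body, :thinkingConfig, %{thinkingBudget: budget})
  end

  # Any other options: leave the request body untouched.
  def maybe_add(body, _opts), do: body
end
```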

However, gemini-2.5-flash actually does use a default thinkingBudget even when thinking_mode: false, meaning it behaves as if it’s in “thinking” mode by default. This could lead to behavior that differs from what users expect.

One way to handle this would be to add model-specific branching based on the model name, but since the codebase hasn’t had such logic before, I’m not sure whether to introduce it here.

Would love to hear your thoughts on this.

nallwhy avatar Apr 22 '25 03:04 nallwhy

Hi @nallwhy! Oh, Google. :disappointed:

What I do in my apps is pattern match on the model name to get the Chat module setup I want. Like:

def setup_model("anthropic" <> _rest = model) do
  # ...
end

Documenting an example of how people should use it would probably be the most helpful. Otherwise, as soon as there's a new gemini model, the library may be making the wrong choice and blocking them from what they need to do.

Add a basic test and some docs to the module doc so people know it's there and how to use it and we'll get it in!

brainlid avatar Apr 23 '25 00:04 brainlid

@nallwhy Oh, BTW, I've got a branch that's about to become v0.4.0-rc.0 that introduces breaking changes in order to broadly add "thinking" support for models.

My changes focus on assistant messages being able to provide ContentPart structs for their responses, and on converting all received messages to a list of ContentParts to make handling more uniform.
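For illustration, a response that mixes a thinking block with regular text might normalize to a list of parts shaped roughly like this (the `:thinking` part type is my assumption about the upcoming release, and plain maps stand in for the actual ContentPart structs):

```elixir
# Hypothetical normalized shape: one message body becomes a list of
# typed parts instead of a single string.
parts = [
  %{type: :thinking, content: "First, weigh the two options..."},
  %{type: :text, content: "Option B is the better fit."}
]

# Downstream code can then filter by part type instead of
# special-casing each provider's response format.
text_parts = Enum.filter(parts, &(&1.type == :text))
```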

I've only added support for OpenAI and Anthropic so far, but all the other functions and supporting code are updated.

Depending on how Gemini returns thinking blocks, you may want to use that approach.

brainlid avatar Apr 23 '25 01:04 brainlid

@brainlid Thanks! I'll continue this work after 0.4.0 is released. I'm planning to:

- Manage the application of thinking_mode based on the model name, similar to how you use setup_model/1
- Convert thinking blocks into ContentPart for better consistency

nallwhy avatar Apr 23 '25 01:04 nallwhy

#354

nallwhy avatar Aug 05 '25 03:08 nallwhy