
google.generativeai: tools don't work with JSON mode

Open vitalek84 opened this issue 11 months ago • 53 comments

Description of the bug:

If I configure the model to use response_mime_type="application/json" and also provide some tools to the model, I always receive this error:

google.api_core.exceptions.InvalidArgument: 400 For controlled generation of only function calls (forced function calling), please set 'tool_config.function_calling_config.mode' field to ANY instead of populating 'response_mime_type' and 'response_schema' fields. For more details, see: https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/function-calling#tool-config

Here is code to reproduce the error:

import json

import google.generativeai as genai
from google.api_core import retry

def add(a:int, b:int)->int:
    return a+b

model = genai.GenerativeModel("gemini-1.5-flash-latest",
                                  generation_config={"response_mime_type": "application/json"},
                                  tools=add)

prompt = """List a few popular cookie recipes using this JSON schema:
{bu}
Recipe = {{'recipe_name': str, 'have_meet': bool}}
Return: list[Recipe]
"""

retry_policy = {"retry": retry.Retry(predicate=retry.if_transient_error)}
developer_chat = model.start_chat(enable_automatic_function_calling=True)
resp = developer_chat.send_message(prompt.format(bu="Please sum 42 and 42 and then please insert  to recipes list recipe_name=OLOOL42"), request_options=retry_policy)

print(resp)
print("*********")
print(json.loads(resp.text))

Actual vs expected behavior:

Expected behavior: the model calls the tool and returns some JSON as output.

Actual:

function_calling_config { mode: AUTO }

Traceback (most recent call last):
  File "/home/bva/src/thdevelop/src/thebot/tt.py", line 35, in <module>
    resp = developer_chat.send_message(prompt.format(bu="Please sum 42 and 42 and then please insert to recipes list recipe_name=OLOOL42"), request_options=retry_policy)
  File "/home/bva/src/thdevelop/src/venv/lib/python3.12/site-packages/google/generativeai/generative_models.py", line 578, in send_message
    response = self.model.generate_content(
  File "/home/bva/src/thdevelop/src/venv/lib/python3.12/site-packages/google/generativeai/generative_models.py", line 331, in generate_content
    response = self._client.generate_content(
  File "/home/bva/src/thdevelop/src/venv/lib/python3.12/site-packages/google/ai/generativelanguage_v1beta/services/generative_service/client.py", line 830, in generate_content
    response = rpc(
  File "/home/bva/src/thdevelop/src/venv/lib/python3.12/site-packages/google/api_core/gapic_v1/method.py", line 131, in __call__
    return wrapped_func(*args, **kwargs)
  File "/home/bva/src/thdevelop/src/venv/lib/python3.12/site-packages/google/api_core/retry/retry_unary.py", line 293, in retry_wrapped_func
    return retry_target(
  File "/home/bva/src/thdevelop/src/venv/lib/python3.12/site-packages/google/api_core/retry/retry_unary.py", line 153, in retry_target
    _retry_error_helper(
  File "/home/bva/src/thdevelop/src/venv/lib/python3.12/site-packages/google/api_core/retry/retry_base.py", line 212, in _retry_error_helper
    raise final_exc from source_exc
  File "/home/bva/src/thdevelop/src/venv/lib/python3.12/site-packages/google/api_core/retry/retry_unary.py", line 144, in retry_target
    result = target()
  File "/home/bva/src/thdevelop/src/venv/lib/python3.12/site-packages/google/api_core/timeout.py", line 120, in func_with_timeout
    return func(*args, **kwargs)
  File "/home/bva/src/thdevelop/src/venv/lib/python3.12/site-packages/google/api_core/grpc_helpers.py", line 78, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.InvalidArgument: 400 For controlled generation of only function calls (forced function calling), please set 'tool_config.function_calling_config.mode' field to ANY instead of populating 'response_mime_type' and 'response_schema' fields. For more details, see: https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/function-calling#tool-config
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1735953543.903539 40355 init.cc:229] grpc_wait_for_shutdown_with_timeout() timed out.

Any other information you'd like to share?

According to the document linked in the error (which points to the Vertex AI docs, not the google.generativeai docs), tool_config should be properly configured. I tried configuring it in two different ways and always received the same error. First version, via ToolConfig and FunctionCallingConfig:

import json

import google.generativeai as genai
from google.ai.generativelanguage_v1beta import ToolConfig, FunctionCallingConfig
from google.api_core import retry
from google.generativeai.types import Tool

def add(a:int, b:int)->int:
    return a+b

tool_config = ToolConfig(
    function_calling_config=FunctionCallingConfig(mode=FunctionCallingConfig.Mode.ANY)
)

my_tool = Tool(function_declarations=[add])

model = genai.GenerativeModel("gemini-1.5-flash-latest",
                                  generation_config={"response_mime_type": "application/json"},
                                  tools=my_tool,
                                  tool_config=tool_config
                              )
print(f"Model tool config: {model._tool_config}")
prompt = """List a few popular cookie recipes using this JSON schema:
{bu}
Recipe = {{'recipe_name': str, 'have_meet': bool}}
Return: list[Recipe]
"""

retry_policy = {"retry": retry.Retry(predicate=retry.if_transient_error)}
developer_chat = model.start_chat(enable_automatic_function_calling=True)
resp = developer_chat.send_message(prompt.format(bu="Please sum 42 and 42 and then please insert  to recipes list recipe_name=OLOOL42"), request_options=retry_policy)

print(resp)
print("*********")
print(json.loads(resp.text))

Second version, via a simple dict:

import json

import google.generativeai as genai
from google.ai.generativelanguage_v1beta import FunctionCallingConfig
from google.api_core import retry

def add(a:int, b:int)->int:
    return a+b

model = genai.GenerativeModel("gemini-1.5-flash-latest",
                                  generation_config={"response_mime_type": "application/json"},
                                  tools=[add],
                                  tool_config={
                                  "function_calling_config": {
                                      "mode": FunctionCallingConfig.Mode.ANY
                                        }
                                  }
                              )
print(f"Model tool config: {model._tool_config}")
prompt = """List a few popular cookie recipes using this JSON schema:
{bu}
Recipe = {{'recipe_name': str, 'have_meet': bool}}
Return: list[Recipe]
"""

retry_policy = {"retry": retry.Retry(predicate=retry.if_transient_error)}
developer_chat = model.start_chat(enable_automatic_function_calling=True)
resp = developer_chat.send_message(prompt.format(bu="Please sum 42 and 42 and then please insert  to recipes list recipe_name=OLOOL42"), request_options=retry_policy)

print(resp)
print("*********")
print(json.loads(resp.text))

In both cases, print(f"Model tool config: {model._tool_config}") shows the proper config:

Model tool config: function_calling_config {
  mode: ANY
}

It looks like a bug, or am I doing something wrong?
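
In other words, each feature is accepted by the API on its own; only the combination is rejected. A minimal sketch using plain dicts that mirror the GenerativeModel keyword arguments above (no API call is made, and the tools entry is just a placeholder):

```python
# Sketch: configurations the API accepts individually. Merging them is
# exactly what this report does, and it is the merge that triggers the 400.

# 1) JSON mode only, no tools attached:
json_mode_kwargs = {
    "generation_config": {"response_mime_type": "application/json"},
}

# 2) Forced function calling only, no response_mime_type:
forced_calling_kwargs = {
    "tools": ["add"],  # placeholder for the real callable
    "tool_config": {"function_calling_config": {"mode": "ANY"}},
}

# The failing requests above effectively combine both sets of arguments:
rejected = {**json_mode_kwargs, **forced_calling_kwargs}
print(sorted(rejected))  # ['generation_config', 'tool_config', 'tools']
```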

vitalek84 avatar Jan 04 '25 01:01 vitalek84

Hello @vitalek84,

I think your issue is that the former SDK and the 1.5 models were not able to use different "tools" at the same time (in your case, JSON mode and function calling).

But the good news is that this is one of the new capabilities of Gemini 2.0! While we haven't written an example that uses those two tools together, you can check the Get Started notebook for an introduction to the new model and the new SDK. In particular, here's where you'll learn how to set up function calling and JSON mode.

Tell me if you're still not able to make it work with the new SDK and model.

Giom-V avatar Jan 06 '25 14:01 Giom-V

Hi @Giom-V,

I am curious whether it works the same way through the OpenAI library? I tried with gemini-2.0-flash-exp today and got an error.

vaeho avatar Jan 06 '25 14:01 vaeho

Hi @Giom-V! Thank you for the response. But it seems the issue persists in this library version too. Here is the example that I ran:

from google import genai
from google.genai import types
from pydantic import BaseModel

client = genai.Client()

MODEL_ID = "gemini-2.0-flash-exp"

# The API key is set via an environment variable, so it isn't passed explicitly.
# This call just shows that the library works:
response = client.models.generate_content(
    model=MODEL_ID,
    contents="What's the largest planet in our solar system?"
)
print(response.text)


class ShouldBeResponse(BaseModel):
    text: str

# This is from the Get Started notebook
get_destination = types.FunctionDeclaration(
    name="get_destination",
    description="Get the destination that the user wants to go to",
    parameters={
        "type": "OBJECT",
        "properties": {
            "destination": {
                "type": "STRING",
                "description": "Destination that the user wants to go to",
            },
        },
    },
)

my_tool = types.Tool(function_declarations=[get_destination])
system_instruction = "Say always text = 42 and then call get_destination with destination='Restaurant at the end of the Universe'"

chat = client.chats.create(
    model=MODEL_ID,
    config=types.GenerateContentConfig(
        response_mime_type="application/json",
        system_instruction=system_instruction,
        temperature=0.5,
        response_schema=ShouldBeResponse,
        tools=[my_tool],
        tool_config=types.ToolConfig(
            function_calling_config=types.FunctionCallingConfig(mode="ANY")
        )
    )
)

resp = chat.send_message("how are you?")
print(resp)

and I got nearly the same error:

Traceback (most recent call last):
  File "/home/bva/src/thdevelop/src/thebot/tt42.py", line 53, in <module>
    resp = chat.send_message("how are you?")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bva/src/thdevelop/src/venv/lib/python3.12/site-packages/google/genai/chats.py", line 81, in send_message
    response = self._modules.generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bva/src/thdevelop/src/venv/lib/python3.12/site-packages/google/genai/models.py", line 4405, in generate_content
    response = self._generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bva/src/thdevelop/src/venv/lib/python3.12/site-packages/google/genai/models.py", line 3667, in _generate_content
    response_dict = self.api_client.request(
                    ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bva/src/thdevelop/src/venv/lib/python3.12/site-packages/google/genai/_api_client.py", line 325, in request
    response = self._request(http_request, stream=False)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bva/src/thdevelop/src/venv/lib/python3.12/site-packages/google/genai/_api_client.py", line 263, in _request
    return self._request_unauthorized(http_request, stream)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bva/src/thdevelop/src/venv/lib/python3.12/site-packages/google/genai/_api_client.py", line 285, in _request_unauthorized
    errors.APIError.raise_for_response(response)
  File "/home/bva/src/thdevelop/src/venv/lib/python3.12/site-packages/google/genai/errors.py", line 100, in raise_for_response
    raise ClientError(status_code, response)
google.genai.errors.ClientError: 400 INVALID_ARGUMENT. {'error': {'code': 400, 'message': "For controlled generation of only function calls (forced function calling), please set 'tool_config.function_calling_config.mode' field to ANY instead of populating 'response_mime_type' and 'response_schema' fields. For more details, see: https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/function-calling#tool-config", 'status': 'INVALID_ARGUMENT'}}

As you may notice, I set:

tool_config=types.ToolConfig(
    function_calling_config=types.FunctionCallingConfig(mode="ANY")
)

but it didn't help.

vitalek84 avatar Jan 07 '25 01:01 vitalek84

@vitalek84 I was also able to reproduce the issue. I'll report it to the SDK team.

Giom-V avatar Jan 07 '25 10:01 Giom-V

The SDK team confirms it is not supported at the moment, but it has been added to their backlog and should be available in the coming weeks/months.

Giom-V avatar Jan 07 '25 17:01 Giom-V

Thank you for your explanation and quick response! I will wait for this functionality. One more thing: it would be great to have a clearer error message stating that this functionality isn't supported yet, rather than a complex error output and a link to documentation that doesn't help resolve the issue.

vitalek84 avatar Jan 07 '25 17:01 vitalek84

Yes, better error messages are our top priority.

Giom-V avatar Jan 08 '25 10:01 Giom-V

Yes, please fix it. It significantly limits the possible usage of Gemini. Structured output works fine with gpt-4o/4o-mini, but with Gemini we have to choose one of the modes.

sergeyzenchenko avatar Jan 31 '25 14:01 sergeyzenchenko

Same here. I was forced to use gpt-4o instead of Gemini. Please fix it so we can use Gemini at full capacity.

skayka avatar Jan 31 '25 15:01 skayka

I'm facing the same issue; could you please fix it ASAP? Thanks

fylfot avatar Feb 03 '25 17:02 fylfot

As a quick-and-dirty fix, you can add something like this to your prompt: SUPER SUPER IMPORTANT: You should always return JSON. The JSON should be valid and properly enclosed!!! Please use this schema:

Changed_files = {{
file_path: str
action: str  # add|change|delete
}}
Response = {{
    changed_files: List[Changed_files]
}}
Return: Response

{{ and }} here are just the escape sequence for { in Python f""" {some_variable_here} """ strings. The schema looks odd, a mix of Python-style type definitions and JSON, but it works. I use it in one of my projects; I have run 100+ tests and it is stable with the default temperature, top_k and top_p. Then you can use something like this:
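
The snippet itself was only shared via an external link (below), so as an illustration here is a hypothetical helper (parse_model_json is my own name, not from the thread) showing the kind of post-processing this prompt hack needs: strip optional Markdown fences from the model's text, parse it as JSON, and lightly validate it against the schema in the prompt:

```python
import json

def parse_model_json(text: str) -> dict:
    """Strip optional Markdown code fences and parse the model's reply as JSON."""
    cleaned = text.strip()
    if cleaned.startswith("```"):
        cleaned = cleaned.split("\n", 1)[1]    # drop opening fence / language tag
        cleaned = cleaned.rsplit("```", 1)[0]  # drop closing fence
    data = json.loads(cleaned)
    # Light validation against the schema described in the prompt above.
    for item in data.get("changed_files", []):
        if item.get("action") not in {"add", "change", "delete"}:
            raise ValueError(f"unexpected action: {item!r}")
    return data

sample = '```json\n{"changed_files": [{"file_path": "a.py", "action": "add"}]}\n```'
print(parse_model_json(sample))
```

Because the schema is only enforced by the prompt, json.loads can still fail on a bad reply, so callers should be ready to catch and retry.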

vitalek84 avatar Feb 03 '25 17:02 vitalek84

I am sorry I can't post the code snippet with proper formatting, because it contains ` characters (or I don't know how to use GitHub's markdown :) ). Here is a link to the code snippet: https://www.sharecode.in/chnPSF

vitalek84 avatar Feb 03 '25 17:02 vitalek84

@vitalek84 not really a solution, since it doesn't enforce a strict schema. It may work in many cases, but it's not a stable solution like controlled generation. Right now we are using a set of tool calls as part of a unified structured response schema, but that can hit the schema-complexity limit, and I also assume the model performs better with native function calling.

sergeyzenchenko avatar Feb 05 '25 09:02 sergeyzenchenko

@sergeyzenchenko I totally agree that it isn't a production solution. As I wrote, it is a very dirty hack; I just shared my experience, and if you need to do some experiments with Gemini, it is OK. But of course, for production it should be supported by the native library.

vitalek84 avatar Feb 06 '25 22:02 vitalek84

One thing that I do is create a function called format_for_user or similar, and instruct the model to use that function when producing the final response. This ensures schema adherence. It feels hacky, and native support for a response schema would be much better, but at least I can move on with my life now.

Sometimes the model will just refuse to use the schema, but in this case you'll get a finish_reason=MALFORMED_FUNCTION_CALL response which you can act on.

sergioperezf avatar Feb 11 '25 05:02 sergioperezf

So strict JSON format is not supported? Please update here when it is, and also update this documentation; it is very confusing: https://ai.google.dev/gemini-api/docs/structured-output?lang=python

Yaara-Novia avatar Feb 24 '25 16:02 Yaara-Novia

Kindly fix the issue ASAP

Nirupam-Naragund avatar Feb 25 '25 15:02 Nirupam-Naragund

Hi, are there any updates on the topic?

Lorentzo92 avatar Mar 01 '25 12:03 Lorentzo92

> Hello @vitalek84,
>
> I think your issue is that the former SDK and the 1.5 models were not able to use different "tools" at the same time (in your case, JSON mode and function calling).
>
> But the good news is that this is one of the new capabilities of Gemini 2.0! While we haven't written an example that uses those two tools together, you can check the Get Started notebook for an introduction to the new model and the new SDK. In particular, here's where you'll learn how to set up function calling and JSON mode.
>
> Tell me if you're still not able to make it work with the new SDK and model.

Hi,

I am not using an SDK, but I am trying this with Gemini 2.0 (Flash and Pro) and still getting the same error.

honzasterba avatar Mar 08 '25 22:03 honzasterba

+1

moshe-shaham-lumigo avatar Mar 10 '25 05:03 moshe-shaham-lumigo

+1

markitosgv avatar Mar 13 '25 07:03 markitosgv

What is JSON mode? What does it do?

Asjleyam69 avatar Mar 13 '25 07:03 Asjleyam69

Why was this issue closed? I'm having the same problem; someone said it works with Gemini 2.0, but I have just tried it and it does not.

fonsecabc avatar Mar 18 '25 13:03 fonsecabc

another +1 here

jorge07 avatar Mar 22 '25 15:03 jorge07

+1

MojRoid avatar Mar 27 '25 21:03 MojRoid

same here

JuanJo-RC avatar Mar 28 '25 12:03 JuanJo-RC

Hi guys! It seems the native libraries still don't work, but I found a solution that works for me: I switched to PydanticAI. I haven't had time to figure out how things work under the hood, but it seems that if you use PydanticAI with Gemini you can do both things: obtain structured output from the Gemini-series models and use tools. Here is a simple example:

import os
import random
from typing import cast

from pydantic import BaseModel
from pydantic_ai import Agent, RunContext
from pydantic_ai.models import KnownModelName


model = cast(KnownModelName, os.getenv('PYDANTIC_AI_MODEL', 'google-gla:gemini-2.0-flash'))

class MyModel(BaseModel):
    city: str
    country: str
    population: int

agent = Agent(
    model,
    result_type=MyModel,
    system_prompt=(
        'When a user asks you: tell me a city '
        'You should call function random city '
        'Then find information about this city and return proper response to a user'
    ),
)

@agent.tool
async def random_city(ctx: RunContext[None]) -> str:
    cities = ['Buenos Aires', 'Paris', 'New York', 'London', 'Los Angeles']
    random.shuffle(cities)
    return cities[0]

result = agent.run_sync('Tell me a city')
print(result.data)
result = agent.run_sync('Tell me a city again')
print(result.data)
print(result.usage())

Don't forget to export your GEMINI_API_KEY to your environment (export GEMINI_API_KEY=...). I hope it will be useful.

vitalek84 avatar Mar 29 '25 17:03 vitalek84

Just want to jump in and +1 this. I have been struggling with this issue as well and have been using the REST API directly, with 2.0.

I am not sure this issue should be closed. I don't see why you can't enforce JSON mode and still have Gemini use tool calling when needed, all at the same time; I don't quite follow why they would contradict each other.

Can we get more details on why this is a restriction, and/or a link to the docs that explain it?

adiberk avatar Apr 03 '25 03:04 adiberk