
Google Gemini model fails to use tools properly in follow-up messages (gemini-2.5-pro-preview-03-25)

Open ElectricCodeGuy opened this issue 8 months ago • 20 comments

Description

We're experiencing inconsistent behavior with the Google Gemini model (gemini-2.5-pro-preview-03-25) when used with the Vercel AI SDK's streamText function with tools. The model works correctly for the first user message but fails to properly use tools in follow-up messages within the same conversation.

Our implementation follows the same pattern as the RAG chatbot example in the Vercel AI SDK docs (https://sdk.vercel.ai/docs/guides/rag-chatbot), but with multiple tools covering different information areas.

Environment

  • Vercel AI SDK (latest version)
  • Google Gemini model: gemini-2.5-pro-preview-03-25
  • Next.js application (App Router API routes)

Expected Behavior

The model should consistently select and execute appropriate tools throughout the entire conversation, including follow-up questions that require clarification or additional information from previously used tools.

Actual Behavior

In follow-up messages, the model:

  1. Refuses to use previously used tools that would be appropriate
  2. Claims it will use a tool but doesn't actually execute it
  3. Hallucinates answers instead of retrieving information
  4. Sometimes starts answering without tools, then uses a tool at the very end

Steps to Reproduce

  1. Set up a conversation with the Gemini model using streamText with multiple tools
  2. Send an initial query that successfully uses tools to retrieve information
  3. Send a follow-up question that requires using the same tools again
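
For reference, a minimal sketch of that setup; the model call shape follows the docs, but the tool, prompt, and lookup() helper are illustrative assumptions rather than our actual code:

import { google } from '@ai-sdk/google';
import { streamText, tool } from 'ai';
import { z } from 'zod';

// messages = full conversation history, including the earlier tool-using turns
const result = streamText({
  model: google('gemini-2.5-pro-preview-03-25'),
  system: 'Check the knowledge base before answering. Only answer from tool results.',
  messages,
  tools: {
    searchKnowledgeBase: tool({
      description: 'Search the knowledge base for relevant information',
      parameters: z.object({
        query: z.string().describe('Search query'),
      }),
      // lookup() is a placeholder for the actual retrieval call
      execute: async ({ query }) => lookup(query),
    }),
  },
  maxSteps: 5,
});

On the first user message this calls the tool as expected; the failure modes listed above show up on the follow-up turns.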

Additional Information

  • Only occurs with Gemini - Claude models work correctly throughout conversations
  • First message always works as expected with proper tool usage
  • Filtering out tool messages with coreMessage.filter((msg) => msg.role !== 'tool') didn't help (a sketch of this is shown after this list)
  • Each tool retrieves specific contextual information from different information areas
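
For context, a sketch of the filtering attempt from the last bullet (coreMessages stands in for the converted message history; tools is abbreviated):

// Strip tool messages from the history before handing it to streamText
const filteredMessages = coreMessages.filter((msg) => msg.role !== 'tool');

const result = streamText({
  model: google('gemini-2.5-pro-preview-03-25'),
  messages: filteredMessages,
  tools,
  maxSteps: 5,
});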

Is this a known issue with the Gemini integration? Are any workarounds available?

ElectricCodeGuy avatar Apr 12 '25 16:04 ElectricCodeGuy

I am getting a similar error with 2.5 pro experimental, but on every call:

Inference Error [Error [AI_APICallError]: * GenerateContentRequest.tools[0].function_declarations[5].parameters.required[2]: property is not defined
* GenerateContentRequest.tools[0].function_declarations[6].parameters.required[1]: property is not defined
] {
  cause: undefined,
  url: 'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro-exp-03-25:streamGenerateContent?alt=sse',
  requestBodyValues: [Object],
  statusCode: 400,
  responseHeaders: [Object],
  responseBody: '{\n' +
    '  "error": {\n' +
    '    "code": 400,\n' +
    '    "message": "* GenerateContentRequest.tools[0].function_declarations[5].parameters.required[2]: property is not defined\\n* GenerateContentRequest.tools[0].function_declarations[6].parameters.required[1]: property is not defined\\n",\n' +
    '    "status": "INVALID_ARGUMENT"\n' +
    '  }\n' +
    '}',
  isRetryable: false,
  data: [Object]
}

My tool setup works fine with Claude models

"ai": "4.3.4" "@ai-sdk/google": "1.2.10"

cgilly2fast avatar Apr 14 '25 00:04 cgilly2fast

I'm not sure if this is a bug with the package (most likely not) or the AI model. I just find it odd that the best AI model (at least for my use case) can't figure out how to execute a tool when gpt-4o-mini can do it without any problems...

ElectricCodeGuy avatar Apr 14 '25 07:04 ElectricCodeGuy

fromToken: z
  .optional(z.string().describe("The token address to swap from"))
  .describe("The token address to swap from"),
toToken: z
  .optional(z.string().describe("The token address to swap to"))
  .describe("The token address to swap to"),
fromChain: z
  .optional(z.enum(CHAINS).describe("The source chain being bridged from"))
  .describe("The source chain being bridged from"),
toChain: z
  .optional(z.enum(CHAINS).describe("The destination chain being bridged to"))
  .describe("The destination chain being bridged to"),
amount: z
  .string()
  .optional()
  .describe(
    "The amount of tokens to swap, scaled down by the token's decimals. Represent as a Number, i.e. '0.248'"
  ),

For this zod schema the tool call straight up fails

akv2011 avatar Apr 14 '25 12:04 akv2011

Hi @ElectricCodeGuy I tried to reproduce this and wasn't able to. Here's what I tried:

models: gemini-2.5-pro-exp-03-25 & gemini-2.5-pro-preview-03-25

useChat with:

    onToolCall({ toolCall }) {
      console.log('Tool call:', toolCall);
    },

https://github.com/user-attachments/assets/c22c01f9-1663-493e-8c26-d5d0466eadc6

Would you be able to provide more details on what specific prompts you're using and what tools you have?

samdenty avatar Apr 14 '25 17:04 samdenty

@akv2011 I inputted your schema, what prompt did you use?

Image

samdenty avatar Apr 14 '25 17:04 samdenty

@akv2011 I inputted your schema, what prompt did you use?

Image

Any prompt. Unless I comment out the tool with this schema, it leads to an "an error occurred" response in the Vercel AI SDK UI.

akv2011 avatar Apr 14 '25 17:04 akv2011

@cgilly2fast for your issue specifically, I believe you're running into a known limitation with the conversion from JSON schema to OpenAPI. See here; you can paste your full schema there too: https://sdk.vercel.ai/providers/ai-sdk-providers/google-generative-ai#schema-limitations
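
One way to see what will be sent is to dump the converted JSON schema for a suspect tool; a small sketch using the zod-to-json-schema package (the parameters object here is made up for illustration):

import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';

// Hypothetical parameters object with a free-form field
const parameters = z.object({
  data: z.object({}).catchall(z.any()).describe('Arbitrary fields to update'),
  depth: z.number().int().min(0).max(10).optional(),
});

// Constructs that don't survive the conversion to Google's OpenAPI subset
// (e.g. additionalProperties, unsupported string formats) are visible here.
console.log(JSON.stringify(zodToJsonSchema(parameters), null, 2));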

samdenty avatar Apr 14 '25 17:04 samdenty

@akv2011 can you provide the tools you're using + streamText options, and some messages I can paste in to repro? Screenshots and/or recordings showing the issue would help a ton

samdenty avatar Apr 14 '25 17:04 samdenty

https://github.com/user-attachments/assets/3252fc3c-07e6-4371-84d8-84a333a788c1

So it is a "chat with the law and verdicts" RAG-type application. I have 5 tools, each representing a certain area of the law: 1 tool for Danish law, 1 tool for EU law, 1 tool for court cases, and so on.

The first prompt works as expected. It executes 2 tools in parallel. Each of the tools returns data about the area and some instructions on how to format the markdown references.

However, in my follow-up question, when I ask it to find more court cases about the subject, it first explains that it will use the verdict tool, but it never actually uses it and just makes some cases up. Sometimes it will execute the tool after the initial response and then answer again, sometimes it will execute the tool and stop, and sometimes it will not execute any tools and just hallucinate an answer.

My base system prompt is:

  const baseSystemPrompt = `You are a helpfull legal assistant.
- Always start by explaining what you are going to do
- Check your knowledge base before answering any questions.
- Only respond to questions using information from tool calls.
- Respond in the user's language
- Always use between 1-3 tools pr user question
- Current date: ${new Date().toISOString().split('T')[0]}


${selectedBlobs.length > 0 ? `IMPORTANT: User has selected ${selectedBlobs.length} documents: ${selectedBlobs.join(', ')}. You MUST use the searchDocumentsTool to search these documents for relevant information.` : ''}

Select tools based on the legal domain of the question.`;

And each tool also has some information related to how to format the links and the context from my vector DB.

I have tried many different prompts, both long and very detailed, but after many tests I found that a short prompt has the highest chance of the AI model executing tools in follow-up questions.

 const result = streamText({
    model: google('gemini-2.5-pro-preview-03-25'),
    temperature: 0,
    abortSignal,
    system: baseSystemPrompt,
    tools: {
      searchDanishLaw: searchDanishLawTool,
      searchEULaw: searchEULawTool,
      searchSkat: searchSkatTool,
      searchVerdicts: searchVerdictsTool,
      searchHearings: searchHearingsTool,
      searchDocumentsTool: searchDocumentsTool({ userId, selectedBlobs }),
      searchWebsite: websiteSearchTool
    },
    maxSteps: 5, // Allow multiple tool calls
    toolCallStreaming: true, // Enable streaming of tool calls
    experimental_telemetry: {
      isEnabled: true,
      functionId: 'unified-legal-query',
      metadata: {
        userId: userId,
        chatId: CurrentChatSessionId,
        isNewChat: isFirstMessage,
        email: userEmail
      },
      recordInputs: true,
      recordOutputs: true
    },
    messages: filteredMessages,
    experimental_activeTools: [
      'searchDanishLaw',
      'searchEULaw',
      'searchSkat',
      'searchVerdicts',
      'searchHearings',
      ...(selectedBlobs.length > 0 ? ['searchDocumentsTool' as const] : []),
      'searchWebsite'
    ]
  });

And each tool looks like:

export const searchDanishLawTool = tool({
  description:
    'Search the knowledge base of Danish laws, regulations and guidelines for legal information. This tool queries our comprehensive database of current applicable (gældende) legislation and retspraksis (established case law) from the official Danish law database (Retsinformation). Its excellent for finding information about established legislation, but not ideal for the newest changes, proposed laws, or very recent regulations that may not yet be indexed in the knowledge base. Use this when you need authoritative information about existing Danish legislation.',
  parameters: z.object({
    query: z
      .string()
      .describe(
        "An optimized version of the user's question formulated as a search query to find the most relevant Danish law information. Focus on key legal terms, specific law names, or section numbers when available."
      )
  }),
  execute: async (args, { messages }) => {
    // ... vector search against the Danish law index (implementation omitted)
  }
});

export const searchEULawTool = tool({
  description:
    'Search our knowledge base of EU laws, regulations, directives, and treaties for legal information. This tool queries our database of official EU legislation documents. IMPORTANT: This tool ONLY contains EU laws and regulatory texts, and does NOT include any EU court cases or judgments (such as those from the European Court of Justice). Use this specifically for queries about EU legislative acts and their content.',
  parameters: z.object({
    query: z
      .string()
      .describe(
        "An optimized version of the user's question formulated as a search query to find the most relevant EU legislation. Focus on specific directive numbers, regulation titles, treaty articles, or key EU legal terminology when available."
      )
  }),
  execute: async (args, { messages }) => {
    // ... vector search against the EU law index (implementation omitted)
  }
});

And the other ones are similar, explaining what they cover.
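
One thing that might be worth testing with this setup (a hedged suggestion; I have not verified that it changes Gemini's behavior): forcing tool selection with streamText's toolChoice option instead of relying on prompt wording. tools, baseSystemPrompt, and filteredMessages are the same as in the snippet above:

const result = streamText({
  model: google('gemini-2.5-pro-preview-03-25'),
  system: baseSystemPrompt,
  messages: filteredMessages,
  tools,
  maxSteps: 5,
  // 'required' forces a tool call on every step (so the run may end at maxSteps
  // without a plain-text answer); { type: 'tool', toolName: 'searchVerdicts' }
  // is the more targeted form for a single follow-up turn.
  toolChoice: 'required',
});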

EDIT:

Changed to the newest OpenAI model (GPT-4.1) and it now behaves as expected.

ElectricCodeGuy avatar Apr 14 '25 18:04 ElectricCodeGuy

@samdenty Thanks for your help. I am still getting the error although I removed a couple of union and record schema types.

I believe these are the two functions it's griping about in the error messages (tbh, I'm not sure I am interpreting the tool path in the error message correctly):

.function_declarations[5]

z.object({
      collection: z
        .enum(allowedCollections)
        .describe('The collection containing the document to update'),
      id: idSchema.describe('The MongoDB ID of the document to update'),
      data: z
        .object({})
        .catchall(z.any())
        .describe('Object containing the fields to update and their new values'),
      depth: z
        .number()
        .int()
        .min(0)
        .max(10)
        .optional()
        .describe('How deeply to populate related documents in the response (default: 1)'),
    })

.function_declarations[6]

z.object({
      collection: z.enum(allowedCollections).describe('The collection to create the document in'),
      data: z
        .object({})
        .catchall(z.any())
        .describe('Object containing the fields and values for the new document'),
      depth: z
        .number()
        .int()
        .min(0)
        .max(10)
        .optional()
        .describe('How deeply to populate related documents in the response (default: 1)'),
    })
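
For what it's worth, if I'm reading the indices right, required[2] in function_declarations[5] and required[1] in function_declarations[6] both point at the data field. A catchall object converts to a schema with only additionalProperties, which Google's OpenAPI subset appears to drop, leaving required referencing a property that is no longer defined. A hedged rewrite that sidesteps this by passing the payload as a JSON string and parsing it in execute:

// Hypothetical Gemini-friendly variant of function_declarations[5]
const updateDocumentParams = z.object({
  collection: z.enum(allowedCollections).describe('The collection containing the document to update'),
  id: idSchema.describe('The MongoDB ID of the document to update'),
  // Free-form objects (catchall / additionalProperties) don't map cleanly to
  // Google's schema format, so pass the update payload as a JSON string instead.
  data: z.string().describe('JSON string of the fields to update and their new values'),
  depth: z.number().int().min(0).max(10).optional()
    .describe('How deeply to populate related documents in the response (default: 1)'),
});

// In execute: const payload = JSON.parse(args.data);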

cgilly2fast avatar Apr 14 '25 18:04 cgilly2fast

@akv2011 can you provide the tools you're using + streamText options, and some messages I can paste in to repro? Screenshots and/or recordings showing the issue would help a ton

model: google('gemini-2.5-pro-exp-03-25'), // model: openai.chat("gpt-4o"),
messages,
tools: {
...matrixMcpTools,

// Client Tools
getDesiredChain: tool({
  description: "Get the desired chain from the user",
  parameters: z.object({}),
}),
getAmount: tool({
  description: "Get the amount of tokens for any operation",
  parameters: z.object({
    maxAmount: z
      .string()
      .optional()
      .describe(
        "The maximum amount (user's balance) that can be entered"
      ),
    tokenSymbol: z
      .string()
      .optional()
      .describe("The token symbol to display"),
  }),
}),
createPerpsOrder: tool({
  description:
    "Create a perps order using the Hyperliquid protocol. All params are optional",
  parameters: z.object({
    market: z
      .string()
      .min(1)
      .optional()
      .describe("The market name (e.g., 'BTC')"),
    size: z.string().min(1).optional().describe("The order size"),
    isBuy: z.boolean().optional().describe("Whether to buy or sell"),
    orderType: z
      .enum(["limit", "market"])
      .optional()
      .describe("The type of order"),
    price: z
      .string()
      .optional()
      .nullable()
      .describe("The order price (required for limit orders)"),
    timeInForce: z
      .enum(["Alo", "Ioc", "Gtc"])
      .optional()
      .describe("Time in force for limit orders"),
  }),
}),
getSwapBridgeData: tool({
  description:
    "Populates swap and/or bridge transaction data for the LiFi widget",
  parameters: z.object({
    fromToken: z
      .optional(z.string().describe("The token address to swap from"))
      .describe("The token address to swap from"),
    toToken: z
      .optional(z.string().describe("The token address to swap to"))
      .describe("The token address to swap to"),
    fromChain: z
      .optional(
        z.enum(CHAINS).describe("The source chain being bridged from")
      )
      .describe("The source chain being bridged from"),
    toChain: z
      .optional(
        z
          .enum(CHAINS)
          .describe("The destination chain being bridged to")
      )
      .describe("The destination chain being bridged to"),
    amount: z
      .string()
      .optional()
      .describe(
        "The amount of tokens to swap, scaled down by the token's decimals. Represent as a Number, i.e. '0.248'"
      ),
  }),
}),

},

When I use this tool schema, this error comes:

Image

akv2011 avatar Apr 14 '25 18:04 akv2011

With the same streamText setup and tools as in my previous comment, the error above appears. But for this, with createPerpsOrder and getSwapBridgeData commented out:

model: google('gemini-2.5-pro-exp-03-25'), // model: openai.chat("gpt-4o"),
messages,
tools: {
    ...matrixMcpTools,

    // Client Tools
    getDesiredChain: tool({
      description: "Get the desired chain from the user",
      parameters: z.object({}),
    }),
    getAmount: tool({
      description: "Get the amount of tokens for any operation",
      parameters: z.object({
        maxAmount: z
          .string()
          .optional()
          .describe(
            "The maximum amount (user's balance) that can be entered"
          ),
        tokenSymbol: z
          .string()
          .optional()
          .describe("The token symbol to display"),
      }),
    }),
    // createPerpsOrder: tool({
    //   description:
    //     "Create a perps order using the Hyperliquid protocol. All params are optional",
    //   parameters: z.object({
    //     market: z
    //       .string()
    //       .min(1)
    //       .optional()
    //       .describe("The market name (e.g., 'BTC')"),
    //     size: z.string().min(1).optional().describe("The order size"),
    //     isBuy: z.boolean().optional().describe("Whether to buy or sell"),
    //     orderType: z
    //       .enum(["limit", "market"])
    //       .optional()
    //       .describe("The type of order"),
    //     price: z
    //       .string()
    //       .optional()
    //       .nullable()
    //       .describe("The order price (required for limit orders)"),
    //     timeInForce: z
    //       .enum(["Alo", "Ioc", "Gtc"])
    //       .optional()
    //       .describe("Time in force for limit orders"),
    //   }),
    // }),
    // getSwapBridgeData: tool({
    //   description:
    //     "Populates swap and/or bridge transaction data for the LiFi widget",
    //   parameters: z.object({
    //     fromToken: z
    //       .optional(z.string().describe("The token address to swap from"))
    //       .describe("The token address to swap from"),
    //     toToken: z
    //       .optional(z.string().describe("The token address to swap to"))
    //       .describe("The token address to swap to"),
    //     fromChain: z
    //       .optional(
    //         z.enum(CHAINS).describe("The source chain being bridged from")
    //       )
    //       .describe("The source chain being bridged from"),
    //     toChain: z
    //       .optional(
    //         z
    //           .enum(CHAINS)
    //           .describe("The destination chain being bridged to")
    //       )
    //       .describe("The destination chain being bridged to"),
    //     amount: z
    //       .string()
    //       .optional()
    //       .describe(
    //         "The amount of tokens to swap, scaled down by the token's decimals. Represent as a Number, i.e. '0.248'"
    //       ),
    //   }),
    // }),
  },

It calls correctly

Image

akv2011 avatar Apr 14 '25 18:04 akv2011

It doesn't matter what the prompt is.

akv2011 avatar Apr 14 '25 18:04 akv2011

My guesses at the problematic schema features (a simplified variant avoiding them is sketched below):

  • z.enum(...): especially z.enum(CHAINS) in getSwapBridgeData could be complex if the CHAINS array is large. Large enums can sometimes cause issues with schema representation.
  • z.optional(...)
  • z.nullable() (in createPerpsOrder)
  • Detailed .describe() strings.
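
For narrowing it down, a simplified variant of getSwapBridgeData that avoids all of the above (illustrative only; tool and z imported as in the other snippets):

getSwapBridgeData: tool({
  description: 'Populates swap and/or bridge transaction data for the LiFi widget',
  parameters: z.object({
    fromToken: z.string().optional().describe('The token address to swap from'),
    toToken: z.string().optional().describe('The token address to swap to'),
    // Plain strings instead of z.enum(CHAINS) to rule out large enums
    fromChain: z.string().optional().describe('The source chain being bridged from'),
    toChain: z.string().optional().describe('The destination chain being bridged to'),
    amount: z.string().optional().describe("The amount of tokens to swap, e.g. '0.248'"),
  }),
}),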

akv2011 avatar Apr 14 '25 18:04 akv2011


I get this similar issue with gemini-2.5-pro. Sometimes it will execute the tool after the initial response and then answer again, sometimes it will execute the tool and stop, and sometimes it will not execute any tools and just hallucinate an answer.

Flash and other models work correctly when I call multiple tool functions.

akv2011 avatar Apr 14 '25 18:04 akv2011

@ElectricCodeGuy are you able to reach out to me at [email protected]? I want to set up a call to get to the bottom of this.

samdenty avatar Apr 23 '25 19:04 samdenty

@samdenty Can you check out my previous comment? It got lost in akv2011's comments:

https://github.com/vercel/ai/issues/5717#issuecomment-2802550183

cgilly2fast avatar Apr 24 '25 00:04 cgilly2fast

@cgilly2fast another way that would help me is if you're able to create a reproduction repo with code that gives an error / shows the issue. I believe the one you're running into is separate from the tool-calling issue (it's a schema issue, so it may make sense to open a separate issue).

samdenty avatar Apr 24 '25 00:04 samdenty

@samdenty gotcha thanks

cgilly2fast avatar Apr 24 '25 03:04 cgilly2fast

@ElectricCodeGuy are you able to reach out to me at [email protected]? I want to set up a call to get to the bottom of this.

I will e-mail you as soon as I'm back from holiday. I found an interesting read on Reddit regarding (I think) the same issue: https://www.reddit.com/r/Bard/comments/1k6w4ff/gemini_25_pro_stops_thinking_in_longer_contexts/

The model appears to answer back instantly when it misbehaves. I should probably also mention that each tool returns between 40k-100k tokens, which seems to correlate with the issue described in the Reddit post.
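
If the huge tool outputs are what trips the model up, one mitigation worth testing is truncating what execute returns; a rough sketch with an arbitrary size cap (verdictSearch and formatVerdict are placeholders for the actual retrieval code):

searchVerdicts: tool({
  description: 'Search court verdicts',
  parameters: z.object({ query: z.string() }),
  execute: async ({ query }) => {
    const results = await verdictSearch(query);              // placeholder retrieval call
    const context = results.map(formatVerdict).join('\n\n'); // placeholder formatter
    const MAX_CHARS = 40_000;                                // arbitrary cap, very roughly ~10k tokens
    return context.length > MAX_CHARS
      ? context.slice(0, MAX_CHARS) + '\n\n[truncated]'
      : context;
  },
}),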

ElectricCodeGuy avatar Apr 24 '25 21:04 ElectricCodeGuy

@samdenty I figured it out. The culprits are z.object() and z.string().url().

cgilly2fast avatar May 06 '25 05:05 cgilly2fast

getting the same error

AI_APICallError: * GenerateContentRequest.tools[0].function_declarations[0].parameters.properties[urls].items.format: only 'enum' and 'date-time' are supported for STRING type

Is there any fix for this?

Our users are experiencing this issue in our production app.

Nishchit14 avatar Jul 22 '25 09:07 Nishchit14

@cgilly2fast can you please explain more about how you solved this issue?

Nishchit14 avatar Jul 22 '25 13:07 Nishchit14

@Nishchit14 I replaced z.object() and z.string().url() with more generic types like z.any() or z.string(). You can use z.object() for the top-level schema, but not nested inside the schema:

tool({
  parameters: z.object({   // allowed
    customer: z.object()   // not allowed
  })
})

This is what I did, but I would be surprised if things have changed since. When it errored for me, it gave me the array index of the tool Gemini had issues with, and from that I was able to figure out the params it didn't like. Hope that helps.
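
For the STRING format error quoted earlier ("only 'enum' and 'date-time' are supported for STRING type"), the before/after probably looks like this; a sketch that assumes the urls parameter was built with Zod's .url(), which serializes to format: 'uri':

// Before: rejected by Gemini, because .url() emits format: "uri"
const before = z.object({
  urls: z.array(z.string().url()).describe('URLs to fetch'),
});

// After: plain strings, with the constraint moved into the description
const after = z.object({
  urls: z.array(z.string().describe('A fully qualified URL')).describe('URLs to fetch'),
});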

cgilly2fast avatar Jul 22 '25 17:07 cgilly2fast

@cgilly2fast Thanks for your help. In my case I am not calling any tools; the error is happening while talking directly with the Gemini model.

@lgrammel any help here is appreciated. Thanks for this amazing SDK work.

Nishchit14 avatar Jul 23 '25 11:07 Nishchit14

I use Gemini 2.5 Flash with multiple tools. The schema is:

z.object({
  questions: z.array(z.object({
    question: z.string(),
    answer: z.string(),
  })),
})

Sometimes the model will output only text, without a tool call, even though the text indicates it will call a tool.

I use the OpenRouter provider, and have posted an issue: https://github.com/OpenRouterTeam/ai-sdk-provider/issues/166

fwang2002 avatar Aug 28 '25 15:08 fwang2002