"must be specified when not using one_of" when using zod.union() on the google provider
Description
Related issue: https://github.com/vercel/ai/issues/4153
Error doesn't happen if structuredOutputs: false
Code example
const model = google('gemini-2.0-flash', { structuredOutputs: true });
const mainSchema = z.array(z.union([subSchema1, subSchema2]));
const response = await generateObject({
  model,
  messages,
  schema: z.object({
    chain_of_thought: z.string(),
    locator: mainSchema,
  }),
});
AI provider
@ai-sdk/[email protected]
Additional context
No response
I have seen something similar with enums. I tried to use the new gemini-2.0-flash-001 model in place of where I'm using OpenAI to generate an object that has a property that is an array of objects; those objects have an enum. And I got an error about invalid JSON input. I tried structuredOutputs: false and it ran, but it still failed the zod schema validation.
For reference my schema looks like the following:
const MockSchema = z.object({
  fieldA: z.string(),
  fieldB: z.number(),
  fieldC: z.array(z.object({
    fieldD: z.enum(['a', 'b', 'c']),
  })),
});
And the error looks like:
responseBody: '{\n' +
  '  "error": {\n' +
  '    "code": 400,\n' +
  `    "message": "Invalid JSON payload received. Unknown name \\"type\\" at 'generation_config.response_schema.properties[2].value.items.properties[2].value.any_of[0].properties[2].value.items.properties[2].value': Proto field is not repeating, cannot start list.\\nInvalid value at 'generation_config.response_schema.properties[2].value.items.properties[2].value.any_of[1].type' (type.googleapis.com/google.ai.generativelanguage.v1beta.Type), \\"null\\"",\n` +
  '    "status": "INVALID_ARGUMENT",\n' +
  '    "details": [\n' +
  '      {\n' +
  '        "@type": "type.googleapis.com/google.rpc.BadRequest",\n' +
  '        "fieldViolations": [\n' +
  '          {\n' +
  '            "field": "generation_config.response_schema.properties[2].value.items.properties[2].value.any_of[0].properties[2].value.items.properties[2].value",\n' +
  `            "description": "Invalid JSON payload received. Unknown name \\"type\\" at 'generation_config.response_schema.properties[2].value.items.properties[2].value.any_of[0].properties[2].value.items.properties[2].value': Proto field is not repeating, cannot start list."\n` +
  '          },\n' +
  '          {\n' +
  '            "field": "generation_config.response_schema.properties[2].value.items.properties[2].value.any_of[1].type",\n' +
  `            "description": "Invalid value at 'generation_config.response_schema.properties[2].value.items.properties[2].value.any_of[1].type' (type.googleapis.com/google.ai.generativelanguage.v1beta.Type), \\"null\\""\n` +
  '          }\n' +
  '        ]\n' +
  '      }\n' +
  '    ]\n' +
  '  }\n' +
  '}\n',
The "type" the error mentions corresponds to fieldD in the example above.
I know OpenAI struggles with some complex zod schemas, which I have to use Anthropic to handle (e.g. .default(), .discriminatedUnion()). So maybe Google has a similar limitation?
I am also running into the same error: "Invalid value at 'generation_config.response_schema.properties[1].value.any_of[1].type' (type.googleapis.com/google.ai.generativelanguage.v1beta.Type)"
I think @natac13 is right that the enums are the issue - my type is an enum. type: z.enum(['melee', 'ranged', 'spell'])
Does this only happen on the 2.0 models or also on 1.5 models?
@lgrammel Getting it on all google models - just tested 1.5 pro and 1.5 flash, same thing.
I'm also getting a * GenerateContentRequest.contents: contents is not specified error on generateText requests.
I wonder if they changed something - was existing code suddenly broken or are these new schemas?
I'm not sure, this is actually my first time using the google models so I can't say if it was working before or not - sorry I can't be of more help here!
I just found this thread: https://discuss.ai.google.dev/t/oneof-in-response-schema/55926
It seems like one_of (a zod union) is currently not supported by Google, though it is supported by OpenAI.
Does this only happen on the 2.0 models or also on 1.5 models?
I was trying out gemini-2.0 only.
I wonder if they changed something - was existing code suddenly broken or are these new schemas?
Not a new schema. Just a schema that was working with 4o and 4o-mini; o1 also fails on the complexity of the schema.
I have found that OpenAI does best with a huge generation against a somewhat simple schema. The schema can contain enums, though.
When I need .default() or discriminatedUnion from zod, even OpenAI 4o fails with a hard error on the schema. They do mention it in their docs, though.
That is when I realized the Anthropic 3.5 models can handle detailed schemas but have trouble generating a lot of info from those schemas.
Optimal DX in this case would be for generateObject to always accept a valid zod schema and transform it based on the provider and its specs. If oneOf is not supported, it should not be included in the request as such, but passed as instructions instead (possibly as a description; I'm not sure how it's currently implemented for models that don't support structured outputs). The error should then be thrown only if the model generated a response that could not be validated against the schema (https://sdk.vercel.ai/docs/ai-sdk-core/generating-structured-data#error-handling).
This approach is consistent with the AI SDK's mission (https://sdk.vercel.ai/docs/introduction#why-use-the-ai-sdk).
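To illustrate the suggested transform, here is a rough sketch (not part of the AI SDK; all names are hypothetical) of a pre-processing step that strips oneOf from a JSON schema before sending it to a provider that rejects it, surfacing the removed constraint as a description hint instead:

```typescript
// Hypothetical sketch: recursively remove `oneOf` from a JSON schema and
// keep the union branches as a textual hint in `description`, so a provider
// that rejects `oneOf` still receives a valid schema plus instructions.
function stripUnsupportedOneOf(schema: any): any {
  if (Array.isArray(schema)) return schema.map(stripUnsupportedOneOf);
  if (typeof schema !== 'object' || schema === null) return schema;

  const { oneOf, ...rest } = schema;
  const out: any = {};
  for (const [key, value] of Object.entries(rest)) {
    out[key] = stripUnsupportedOneOf(value);
  }
  if (oneOf) {
    // Fall back to a permissive object and describe the branches in text.
    out.type = out.type ?? 'object';
    const hint = `Must match one of these schemas: ${JSON.stringify(oneOf)}`;
    out.description = out.description ? `${out.description} ${hint}` : hint;
  }
  return out;
}
```

Validation against the original (union-containing) schema would then still happen on the response, as the docs describe for error handling.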
While the AI SDK documentation states that one can temporarily work around this with structuredOutputs: false, when I use it, streamText with a function call still results in an error -- perhaps the tool call is forced to be a structured output?
My workaround for this was to convert my schema to JSON Schema and fill in any missing types with type: 'string'.
I feel like this would be easy to add to the library itself.
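A rough sketch of that workaround, assuming the zod schema has already been converted to JSON Schema (e.g. with a tool like zod-to-json-schema); the function name and the choice of 'string' as the default are illustrative only:

```typescript
// Walk a JSON Schema and default any schema node that lacks a `type` to
// `type: 'string'`, which Gemini's response_schema requires. Nodes that
// declare `oneOf`/`anyOf` are left alone, since a type there would be wrong.
function fillMissingTypes(node: any): any {
  if (Array.isArray(node)) return node.map(fillMissingTypes);
  if (typeof node !== 'object' || node === null) return node;

  const out: any = { ...node };
  if (out.properties) {
    out.properties = Object.fromEntries(
      Object.entries(out.properties).map(([k, v]) => [k, fillMissingTypes(v)]),
    );
  }
  if (out.items) out.items = fillMissingTypes(out.items);
  if (!out.type && !out.oneOf && !out.anyOf) {
    out.type = 'string';
  }
  return out;
}
```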
@natac13 I just tried the following and it works:
const result = await generateObject({
  model: google('gemini-2.0-flash', { structuredOutputs: true }),
  prompt: 'Generate a JSON object',
  schema: z.object({
    fieldA: z.string(),
    fieldB: z.number(),
    fieldC: z.array(
      z.object({
        fieldD: z.enum(['a', 'b', 'c']),
      }),
    ),
  }),
});
console.log(JSON.stringify(result.object, null, 2));
Just spent hours on this (again). It boils down to a limitation of the vertex/gemini APIs. You can reproduce the behavior even w/o the AI SDK:
import 'dotenv/config';

async function main() {
  const response = await fetch(
    'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent',
    {
      method: 'POST',
      headers: {
        'x-goog-api-key': process.env.GOOGLE_GENERATIVE_AI_API_KEY!,
      },
      body: JSON.stringify({
        generationConfig: {
          temperature: 0,
          responseMimeType: 'application/json',
          responseSchema: {
            required: ['elements'],
            type: 'object',
            properties: {
              elements: {
                type: 'array',
                items: {
                  oneOf: [
                    {
                      type: 'object',
                      properties: {
                        age: { type: 'number' },
                      },
                      required: ['age'],
                    },
                    {
                      type: 'object',
                      properties: {
                        name: { type: 'string' },
                      },
                      required: ['name'],
                    },
                  ],
                },
              },
            },
          },
        },
        contents: [
          {
            role: 'user',
            parts: [
              {
                text: 'Generate a JSON object',
              },
            ],
          },
        ],
      }),
    },
  );

  console.log(await response.json());
}

main().catch(error => {
  console.error(JSON.stringify(error, null, 2));
});
which outputs:
{
  error: {
    code: 400,
    message: '* GenerateContentRequest.generation_config.response_schema.properties[elements].items.type: must be specified when not using one_of\n',
    status: 'INVALID_ARGUMENT'
  }
}
TLDR: The Gemini API does not support arrays with mixed content.
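Until Google supports oneOf, one userland workaround (not an AI SDK feature; the helper below is a hypothetical sketch) is to collapse the union branches into a single object schema whose properties are the union of all branch properties, with nothing required, and discriminate after generation:

```typescript
// Sketch: merge the branches of a oneOf into one permissive object schema.
// Each generated array element then only fills the fields of the branch it
// represents, and the caller discriminates on which fields are present.
type Branch = { properties: Record<string, any>; required?: string[] };

function mergeOneOfBranches(branches: Branch[]) {
  const properties: Record<string, any> = {};
  for (const branch of branches) {
    Object.assign(properties, branch.properties);
  }
  // Deliberately no `required` list: no single branch's fields are mandatory.
  return { type: 'object', properties };
}
```

For the repro above, this turns the two branches into a single { type: 'object', properties: { age, name } } items schema, which the Gemini API accepts; the trade-off is that the schema no longer enforces branch exclusivity.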
@imaai please don't post unrelated comments. you can post this as a separate issue or discussion
@lgrammel it seems that it's not just mixed types, but also .nullable() on enums. .optional() works fine, though.
// this will fail
const result = await generateObject({
  model: google('gemini-2.0-flash', { structuredOutputs: true }),
  prompt: 'Generate a JSON object',
  schema: z.object({
    fieldA: z.string(),
    fieldB: z.number(),
    fieldC: z.array(
      z.object({
        fieldD: z.enum(['a', 'b', 'c']).nullable(),
      }),
    ),
  }),
});
I know the first thought is: well, why don't you just make it optional instead? I've found that LLMs have a tendency to output null in JSON instead of omitting a field. It's not something they do all the time, but it happens frequently enough.
This is not so big of an issue, but the way zod schemas work with the AI SDK right now, outputting null would throw an error. This is why I think it'd be nice to have a built-in two-schema approach: 1) here's the input schema I want the LLM to try to follow, and 2) here's the schema I actually want to validate against.
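In the meantime, the two-schema idea can be approximated in userland by dropping null-valued fields from the model's output before validating against the strict schema (e.g. with zod's safeParse). A sketch, with an illustrative helper name:

```typescript
// Sketch: recursively remove null-valued fields from a generated object, so
// an LLM that emits `"field": null` instead of omitting the field can still
// pass validation against a schema that uses .optional() rather than
// .nullable().
function stripNullFields<T extends Record<string, any>>(value: T): Partial<T> {
  const out: Record<string, any> = {};
  for (const [key, v] of Object.entries(value)) {
    if (v === null) continue; // drop nulls entirely
    if (Array.isArray(v)) {
      out[key] = v.map(item =>
        item && typeof item === 'object' ? stripNullFields(item) : item,
      );
    } else if (typeof v === 'object') {
      out[key] = stripNullFields(v);
    } else {
      out[key] = v;
    }
  }
  return out as Partial<T>;
}
```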
Regarding this issue with the Gemini API, do you have an open convo with the Google team and a sense of whether
- This is a problem Google is planning to fix
- This is a problem the AI SDK will have a workaround for
- This is just something we should deal with by changing our schemas
Thanks for looking into this Lars!
@williamlmao can you open a separate issue for the null bug? I think that one might be solvable
@lgrammel oh great - yep!
I've encountered an issue while testing multiple models. Initially, I experienced this error with Gemini, and after testing with OpenAI, Mistral, and LLaMA, I received the following error across all of them:
[Error [AI_ToolExecutionError]: Error executing tool events: No object generated: response did not match schema.]
After some debugging, I found that the following configuration works across the different models:
mistral: mistral("mistral-small-latest"),
gemini: google("gemini-2.0-flash-exp", {
  structuredOutputs: false,
}),
openai: openai("gpt-4.1-mini", {
  structuredOutputs: false,
}),
llama: groq("meta-llama/llama-4-scout-17b-16e-instruct"),
And in the system prompt, I used this format to enforce the output schema:
const forceSchema = `
## Output Format
- Provide the response in JSON format
## JSON Format
summary: "Overall analysis summary of user events and interactions"
highlights: "Three highest events"
{
"summary": "string",
"highlights": [
{ "groupId": "string", "flag": "string", "value": number }
]
}
`;
Here is the schema used for validation:
const schema = z.object({
  summary: z
    .string()
    .describe("Overall analysis summary of user events and interactions"),
  highlights: z
    .array(
      z.object({
        groupId: z.string().describe("Group ID of the frame"),
        flag: z.string().describe("Type of flag associated with the event"),
        value: z.number().describe("Value associated with the flag"),
      })
    )
    .describe("Three highest events"),
});
And this is the function call:
const { object } = await generateObject({
  model: registry.languageModel("general:llama"),
  prompt,
  temperature: 0.2,
  system: analyzer(processed),
  output: "object",
  mode: "json",
  schema,
});

return { response: object };