
Google Gen AI: Context caching breaks with generateObject / generateContent

Open chanmathew opened this issue 1 year ago • 7 comments

Description

Hi there,

I'm trying to implement context caching with the Gemini models, but it keeps returning an error saying:

CachedContent can not be used with GenerateContent request setting system_instruction, tools or tool_config. Proposed fix: move those values to CachedContent from GenerateContent request.

But I'm not passing in system_instruction, tools, or tool_config in my request. Not sure if I'm doing something wrong here?

export const extractedPlacesSchema = z.object({
	results: z.array(
		z.object({
			name: z.string(),
			link: z.string().nullable(),
			image: z.string().nullable(),
			city: z.string().nullable(),
			country: z.string().nullable()
		})
	)
});

const newCache = await cacheManager.create({
	model,
	displayName: crypto.randomUUID(),
	systemInstruction: prompt,
	contents: [
		{
			role: 'user',
			parts: [{ text }]
		}
	],
	ttlSeconds: 60
});
	
const { object } = await generateObject({
	model: google(model, {
		cachedContent: chunkCachedName,
		safetySettings
	}),
	temperature: 0,
	schema: extractedPlacesSchema,
	prompt: 'Extract from the provided content.'
});

It seems to work fine with generateText; however, as soon as I use generateObject or generateContent, it fails with that error.

Also, just to clarify: if the system prompt is already included via the cacheManager, I don't need to specify it again in generateObject, right?
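For readers hitting the same error, the API constraint it describes can be stated as a small rule: when cachedContent is set on a generateContent request, that request must not also set systemInstruction, tools, or toolConfig (those belong on the cache). A minimal sketch of the rule, with hypothetical type and function names:

```typescript
// Hypothetical illustration of the rule in the error message: a request
// that sets cachedContent must not also set systemInstruction, tools,
// or toolConfig at the generateContent level.
interface GenerateContentRequest {
  cachedContent?: string;
  systemInstruction?: unknown;
  tools?: unknown[];
  toolConfig?: unknown;
  contents: unknown[];
}

function violatesCachedContentRule(req: GenerateContentRequest): boolean {
  if (!req.cachedContent) return false;
  return (
    req.systemInstruction !== undefined ||
    req.tools !== undefined ||
    req.toolConfig !== undefined
  );
}

// The failing request in the log below sets both cachedContent and a
// systemInstruction (the latter apparently injected by the SDK), so it
// violates the rule and the API returns 400 INVALID_ARGUMENT.
const failing: GenerateContentRequest = {
  cachedContent: 'cachedContents/h7yj08xq8pe3',
  systemInstruction: { parts: [] },
  contents: [],
};

console.log(violatesCachedContentRule(failing)); // true
```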

Code example

Here is the full error. You can see in requestBodyValues that a systemInstruction object is being passed, potentially injected by the SDK; perhaps that is the issue?

 Error on attempt 3/3: APICallError [AI_APICallError]: CachedContent can not be used with GenerateContent request setting system_instruction, tools or tool_config.

 Proposed fix: move those values to CachedContent from GenerateContent request.
     at file:///Users/mathew/Dev/talewind-app/node_modules/@ai-sdk/provider-utils/dist/index.mjs:431:14
     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
     at async postToApi (file:///Users/mathew/Dev/talewind-app/node_modules/@ai-sdk/provider-utils/dist/index.mjs:336:28)
     at async GoogleGenerativeAILanguageModel.doGenerate (file:///Users/mathew/Dev/talewind-app/node_modules/@ai-sdk/google/dist/index.mjs:364:50)
     at async fn (/Users/mathew/Dev/talewind-app/node_modules/ai/dist/index.mjs:2049:33)
     at async eval (/Users/mathew/Dev/talewind-app/node_modules/ai/dist/index.mjs:299:22)
     at async _retryWithExponentialBackoff (/Users/mathew/Dev/talewind-app/node_modules/ai/dist/index.mjs:129:12)
     at async fn (/Users/mathew/Dev/talewind-app/node_modules/ai/dist/index.mjs:2017:34)
     at async eval (/Users/mathew/Dev/talewind-app/node_modules/ai/dist/index.mjs:299:22)
     at async processChunk (/Users/mathew/Dev/talewind-app/apps/creator-app/src/lib/server/crawl.server.ts:224:24) {
   cause: undefined,
   url: 'https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-002:generateContent',
   requestBodyValues: {
     generationConfig: {
       topK: undefined,
       maxOutputTokens: undefined,
       temperature: 0,
       topP: undefined,
       stopSequences: undefined,
       responseMimeType: 'application/json',
       responseSchema: [Object]
     },
     contents: [ [Object] ],
     systemInstruction: { parts: [Array] },
     safetySettings: [ [Object], [Object], [Object], [Object] ],
     cachedContent: 'cachedContents/h7yj08xq8pe3'
   },
   statusCode: 400,
   responseHeaders: {
     'alt-svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000',
     'cache-control': 'private',
     'content-encoding': 'gzip',
     'content-type': 'application/json; charset=UTF-8',
     date: 'Tue, 22 Oct 2024 18:20:08 GMT',
     server: 'scaffolding on HTTPServer2',
     'server-timing': 'gfet4t7; dur=84',
     'transfer-encoding': 'chunked',
     vary: 'Origin, X-Origin, Referer',
     'x-content-type-options': 'nosniff',
     'x-frame-options': 'SAMEORIGIN',
     'x-xss-protection': '0'
   },
   responseBody: '{\n' +
     '  "error": {\n' +
     '    "code": 400,\n' +
     '    "message": "CachedContent can not be used with GenerateContent request setting system_instruction, tools or tool_config.\\n\\nProposed fix: move those values to CachedContent from GenerateContent request.",\n' +
     '    "status": "INVALID_ARGUMENT"\n' +
     '  }\n' +
     '}\n',
   isRetryable: false,
   data: {
     error: {
       code: 400,
       message: 'CachedContent can not be used with GenerateContent request setting system_instruction, tools or tool_config.\n' +
         '\n' +
         'Proposed fix: move those values to CachedContent from GenerateContent request.',
       status: 'INVALID_ARGUMENT'
     }
   },
   [Symbol(vercel.ai.error)]: true,
   [Symbol(vercel.ai.error.AI_APICallError)]: true
 }

Additional context

"@ai-sdk/google": "^0.0.51", "ai": "^3.4.18", "@google/generative-ai": "^0.21.0",

chanmathew avatar Oct 22 '24 18:10 chanmathew

I can see how this happens with tool mode. Can you try mode: 'json' and see if that helps?

lgrammel avatar Oct 23 '24 15:10 lgrammel
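For readers following along, the suggested change is an option on generateObject (a hedged sketch; in ai v3, generateObject accepts mode: 'auto' | 'json' | 'tool', and tool mode is what routes the schema through tools/tool_config):

```typescript
// Sketch of the suggested change only; the model, schema, and full call
// are as in the original report, and this has not been verified against
// a live API. Forcing 'json' mode avoids tool calling for the schema.
const suggestedOptions = {
  mode: 'json' as const, // instead of the default 'auto', which may pick 'tool'
  temperature: 0,
  prompt: 'Extract from the provided content.',
};

console.log(suggestedOptions.mode); // json
```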

@lgrammel Added that but still same error unfortunately.

chanmathew avatar Oct 23 '24 18:10 chanmathew

Still no solution?

Albin0903 avatar Jan 09 '25 01:01 Albin0903

I have started to get this error too, in code that was working before.

radicalgeek avatar Jan 09 '25 11:01 radicalgeek

It seems that now only version 1.5-001 supports cached content, lol.

Albin0903 avatar Jan 09 '25 13:01 Albin0903

Confirmed. Caching is still working with 1.5-001. Not sure what that means for the future; I hope this is just a temporary mistake. The 1.5 models are supposed to be "stable". 1.5-001 seems to be much stricter about the minimum token requirement, though: I did not need anywhere near as many tokens to create a cache with 1.5-002 when it was working.

radicalgeek avatar Jan 09 '25 22:01 radicalgeek

Here's a method I use to make context caching work, even with function calling. It's hacky, and the SDK could use some improvements to better facilitate this:

https://github.com/vercel/ai/issues/3212#issuecomment-2812761306

ItsWendell avatar Apr 17 '25 13:04 ItsWendell
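The general direction of such workarounds follows the API's own proposed fix: define systemInstruction (and any tools) on the CachedContent when creating the cache, and keep the generateContent request itself free of those fields. A hedged sketch of the two payload shapes, using hypothetical placeholder values rather than a verified end-to-end call:

```typescript
// Sketch of the "proposed fix" from the error message: systemInstruction
// and tools live on the cache, not on the per-request body. All values
// here are hypothetical placeholders.
const cachePayload = {
  model: 'models/gemini-1.5-flash-001',
  displayName: 'example-cache',
  systemInstruction: { parts: [{ text: 'You extract places from content.' }] },
  tools: [] as unknown[], // function declarations would go here, if any
  contents: [{ role: 'user', parts: [{ text: 'long document...' }] }],
  ttlSeconds: 60,
};

// The matching generateContent request then carries only the cache
// reference and the new user turn; no systemInstruction, tools, or
// toolConfig, so the API's constraint is satisfied.
const generatePayload = {
  cachedContent: 'cachedContents/example-id',
  contents: [
    { role: 'user', parts: [{ text: 'Extract from the provided content.' }] },
  ],
};

console.log('systemInstruction' in generatePayload); // false
```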