
Gemini 1.5 Flash: Candidate was blocked due to RECITATION when responseMimeType is json

Open marian2js opened this issue 1 year ago • 98 comments

Description of the bug:

When responseMimeType: 'application/json' is set, the request fails with: [GoogleGenerativeAI Error]: Candidate was blocked due to RECITATION.

However, without responseMimeType, the same prompt works (it returns markdown containing JSON).

The exact same instructions and prompt work in AI Studio, even with JSON output enabled.

// Assumes: import { HarmCategory, HarmBlockThreshold } from '@google/generative-ai'
// The error happens even if safety settings are set to block none.
const safetySettings = [
  {
    category: HarmCategory.HARM_CATEGORY_HARASSMENT,
    threshold: HarmBlockThreshold.BLOCK_NONE,
  },
  {
    category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
    threshold: HarmBlockThreshold.BLOCK_NONE,
  },
  {
    category: HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
    threshold: HarmBlockThreshold.BLOCK_NONE,
  },
  {
    category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
    threshold: HarmBlockThreshold.BLOCK_NONE,
  },
]

const model = this.genAI.getGenerativeModel({
  model: 'gemini-1.5-flash-latest',
  systemInstruction: instructions,
  safetySettings,
})

const generationConfig = {
  temperature: 0,
  topP: 0.95,
  topK: 64,
  maxOutputTokens: 8192,
  responseMimeType: 'application/json', // fails only if this option is sent. 
}

const chatSession = model.startChat({
  generationConfig,
})

const result = await chatSession.sendMessage(prompt)
const text = result.response.text() // throws [GoogleGenerativeAI Error]: Candidate was blocked due to RECITATION.

Actual vs expected behavior:

Actual: Throws [GoogleGenerativeAI Error]: Candidate was blocked due to RECITATION.
Expected: Returns the same result as in AI Studio.

Any other information you'd like to share?

No response

marian2js avatar May 15 '24 14:05 marian2js

Hi @marian2js, sorry for the troubles. Does every prompt cause this, or only specific prompts? If it's a specific prompt, are you able to share the prompt (and system instructions) so we can try to reproduce it?

ryanwilson avatar May 15 '24 15:05 ryanwilson

Hi @ryanwilson. The issue happens only with a very specific prompt that I cannot share publicly. I've been trying to remove the personal data from it, but as soon as I do, it starts working.

I noticed that when I remove the responseMimeType, the JSON returned in the markdown is invalid because it contains a bare JS identifier: { "key": value }. However, the model has returned invalid JSON for a different prompt too, so I don't know if the issue is related to that.

I am sorry for not being of more help.

marian2js avatar May 15 '24 16:05 marian2js

No worries! Out of curiosity, do you run into the same issue if you use sendMessageStream instead of sendMessage? That could be the difference with AI Studio, where the response is streamed.
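
For reference, a minimal sketch of what I mean, reusing the chatSession from your snippet (stream chunks expose text()):

const result = await chatSession.sendMessageStream(prompt)
let streamed = ''
for await (const chunk of result.stream) {
  streamed += chunk.text() // accumulate partial text as it arrives
}
// The aggregated response is also available once the stream completes:
const full = (await result.response).text()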

ryanwilson avatar May 15 '24 17:05 ryanwilson

Hi @ryanwilson, the bug doesn't happen with sendMessageStream. I made changes to my prompt, so this bug is not triggered anymore for me. But I can confirm the bug is still happening with my old prompt and sendMessage.

marian2js avatar May 19 '24 09:05 marian2js

Hi guys,

I'm experiencing the exact same problem as @marian2js. We're using Gemini with Vertex to extract structured data (as JSON) from job offer listing PDFs.

Here's what I've tried so far:

  • Using the 1.5 Pro model instead of the Flash model
  • Switching to sendMessageStream from sendMessage
  • Using 1.0 Pro (works, but the output is non-usable)
  • Changing region, originally using asia-east2, switched to europe-west9, same result

Oddly enough, the issue seems to only happen when processing non-English documents. When I upload PDFs in French, I always run into the RECITATION problem.

On the flip side, if I use Google Vision for OCR on the PDF, and then use the Vercel AI SDK with chat and Gemini 1.5 Flash, it works perfectly with the same prompt, but on the OCR data (string) instead of the inline PDF.
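
Roughly, the working pipeline looks like this sketch (package APIs and file names are assumed; Vision's multi-page PDF handling is elided, so a single page image stands in):

import vision from '@google-cloud/vision'
import { google } from '@ai-sdk/google'
import { generateText } from 'ai'

// OCR one page of the job offer (hypothetical file name).
const ocr = new vision.ImageAnnotatorClient()
const [ocrResult] = await ocr.documentTextDetection('offer-page1.png')
const ocrText = ocrResult.fullTextAnnotation?.text ?? ''

// Run the same extraction prompt over the OCR string instead of the inline PDF.
const { text } = await generateText({
  model: google('models/gemini-1.5-flash-latest'),
  prompt: `${extractionPrompt}\n\n${ocrText}`, // extractionPrompt is assumed
})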

Hope this info helps in figuring out the issue!

florian583 avatar May 20 '24 09:05 florian583

Hi folks,

Running into the same problem as @marian2js (with almost identical API call settings, but in Golang: model config, generationConfig, and SafetyConfig all the same). Again, parsing a plaintext file with a big prompt to output JSON (where the plaintext was originally converted from PDF).

Setting model.GenerationConfig.ResponseMIMEType = "application/json" hits the "blocked: candidate: FinishReasonRecitation" response with no PromptFeedback value, annoyingly, so we're flying a bit blind as to the cause.

If it's helpful, the content is a senior school syllabus, so it shouldn't be anywhere close to hitting any of the harm response thresholds either way.

Commenting out ResponseMIMEType, with a slight prompt modification, gets a usable (yet obviously inconsistent) result.
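
For anyone hitting the same wall in the JS SDK: you can inspect the raw response before calling text() (which is what actually throws). A sketch:

const result = await model.generateContent(prompt)
const candidate = result.response.candidates?.[0]
if (candidate?.finishReason === 'RECITATION') {
  // promptFeedback is often empty here too, matching the Go behavior above
  console.log(candidate.finishReason, result.response.promptFeedback)
} else {
  console.log(result.response.text())
}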

Much appreciated!

GuyVivedus avatar May 20 '24 13:05 GuyVivedus

Hi all, running into the same issue with model: 'gemini-1.5-flash-latest' and responseMimeType: 'application/json'.

I am parsing 50 text documents by passing a JSON schema for the desired output; more than half of them fail, but always the same ones.

Setting the response type to stream does not fix it, and the error always occurs in the same chunk.

Prompt is f"""You will be provided content in the form of html. Using this content, return a valid json object that is based entirely on information from the content, not your guess. The content should satisfy the following json schema: {schema} """
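
In the JS SDK, the batch pattern I'm describing would look something like this sketch (documents, promptPrefix, doc.html, and doc.id are placeholder names):

const failingIds = []
for (const doc of documents) {
  try {
    // Embed the JSON schema in the prompt, mirroring the f-string above.
    const result = await model.generateContent(
      `${promptPrefix}${JSON.stringify(schema)}\n\n${doc.html}`
    )
    doc.parsed = JSON.parse(result.response.text())
  } catch (err) {
    failingIds.push(doc.id) // the same documents fail on every run
  }
}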

KlimentP avatar May 21 '24 17:05 KlimentP

Experiencing same issue as well, same setup as others above.

I also get it if I run generateContent:

const generationConfig = {
  temperature: 1,
  topP: 0.95,
  topK: 64,
  maxOutputTokens: 8192,
  responseMimeType: 'application/json'
}

const safetySettings = [
  {
    category: HarmCategory.HARM_CATEGORY_HARASSMENT,
    threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
  },
  {
    category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
    threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
  },
  {
    category: HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
    threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
  },
  {
    category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
    threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
  }
]

const model = genAI.getGenerativeModel({
  model: 'gemini-1.5-flash-latest',
  generationConfig,
  safetySettings,
  systemInstruction: 'My prompt here.'
})

const result = await model.generateContent(content)

const response = result.response
const text = response.text()

chanmathew avatar May 22 '24 14:05 chanmathew

Changing to stream mode worked for me as well:

const safetySettings = [
  {
    category: HarmCategory.HARM_CATEGORY_HARASSMENT,
    threshold: HarmBlockThreshold.BLOCK_NONE,
  },
  {
    category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
    threshold: HarmBlockThreshold.BLOCK_NONE,
  },
  {
    category: HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
    threshold: HarmBlockThreshold.BLOCK_NONE,
  },
  {
    category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
    threshold: HarmBlockThreshold.BLOCK_NONE,
  },
];

try {
  const chatSession = this.model.startChat({
    generationConfig: {
      temperature: 1,
      topP: 0.95,
      topK: 64,
      maxOutputTokens: 8192,
      responseMimeType: "text/plain",
    },
    history: history,
    systemInstruction: systemMessage,
    safetySettings: safetySettings,
  });

  const stream = true;
  if (stream) {
    const result = await chatSession.sendMessageStream(currentMessage);
    for await (const item of result.stream) {
      console.log("Stream chunk: ", item.candidates[0].content.parts[0].text);
    }
    const aggregatedResponse = await result.response;
    return aggregatedResponse.text();
  } else {
    const result = await chatSession.sendMessage(currentMessage);
    return result.response.text();
  }
} catch (error) {
  console.error("Error during chat session:", error);
  throw error; // Re-throw the error after logging, or handle it as needed
}

ShivQumis avatar May 23 '24 14:05 ShivQumis

I started to receive this today as well. I didn't get it at all yesterday, when I made hundreds of calls; today a significant number of them are hitting the error.

yharaskrik avatar May 24 '24 01:05 yharaskrik

Also getting this error.

jsomeara avatar May 27 '24 03:05 jsomeara

Any updates on this? As it stands, using Gemini 1.5 Pro for RAG-based Q&A with chat history is incredibly unreliable.

Simply ask 2-3 questions about a document with some overlap, and you are almost guaranteed to encounter a RECITATION error.

E.g., using LangChain:

// Assumed imports for this snippet (LangChain JS):
// import { ChatPromptTemplate, MessagesPlaceholder } from '@langchain/core/prompts'
// import { StringOutputParser } from '@langchain/core/output_parsers'
// import { RunnableWithMessageHistory } from '@langchain/core/runnables'

const chatTemplate = ChatPromptTemplate.fromMessages([
  ['system', 'You are a helpful assistant. Answer all questions to the best of your ability.'],
  new MessagesPlaceholder('history'),
  new MessagesPlaceholder('input')
]);
const chain = chatTemplate.pipe(pdfModel).pipe(new StringOutputParser());

const chainWithHistory = new RunnableWithMessageHistory({
  runnable: chain,
  getMessageHistory: () => chatHistory,
  inputMessagesKey: 'input',
  historyMessagesKey: 'history',
  config: { configurable: { sessionId: decodedToken.uid } }
});

const response = await chainWithHistory.invoke({ input: input });

If you simply ask the model to recite a specific section of a file/PDF twice in a row, you will get a RECITATION error guaranteed (whether you pass the input file in once at the beginning of the conversation, only in the latest message, or as part of every message is irrelevant).
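
A minimal sketch of that repro with the plain JS SDK (filePart is hypothetical and stands in for however you attach the PDF):

const chat = model.startChat()
await chat.sendMessage([filePart, { text: 'Recite section 2 of the document verbatim.' }])
// Asking for the same passage again is what reliably triggers the block:
const second = await chat.sendMessage('Recite section 2 verbatim one more time.')
second.response.text() // throws: Candidate was blocked due to RECITATION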

Kashi-Datum avatar May 27 '24 22:05 Kashi-Datum

I'm getting the same error. Is there a way to know in advance whether the input contains something that might cause this error, so it can be skipped?
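
In the meantime, the best I can do is a fallback rather than a pre-flight check - something like this sketch (based on the workarounds above, not a fix):

async function generateJsonWithFallback(model, prompt) {
  try {
    const result = await model.generateContent({
      contents: [{ role: 'user', parts: [{ text: prompt }] }],
      generationConfig: { responseMimeType: 'application/json' },
    })
    return JSON.parse(result.response.text())
  } catch (err) {
    // Fall back to plain text mode and strip the ```json fence manually.
    const fallback = await model.generateContent(prompt)
    const raw = fallback.response.text().replace(/^```(json)?\s*|```\s*$/gm, '').trim()
    return JSON.parse(raw)
  }
}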

Aftab-M avatar Jun 09 '24 17:06 Aftab-M

Hi @ryanwilson, the bug doesn't happen with sendMessageStream. I made changes to my prompt, so this bug is not triggered anymore for me. But I can confirm the bug is still happening with my old prompt and sendMessage.

I have tried the same; it seems the issue still persists with sendMessageStream as well. One more interesting thing I found: when using sendMessageStream along with some tools, it produces a 500 Internal Server Error.

HarshavardhanNetha avatar Jun 12 '24 03:06 HarshavardhanNetha

I have the same problem. I'm sending a screenshot of a PDF; it works in some cases but not in others.

0xthierry avatar Jun 12 '24 20:06 0xthierry

I also have the same problem.

I ran 1.0 Pro vs 1.5 Flash (as I got the email about 1.0 Pro becoming deprecated soon); I tried 1.5 Pro as well.

When running 1.0 and 1.5 Pro, I have zero issues, no matter how many times I retry the same content generation in a row, or how many times I send (assuming I'm not rate limited).

[screenshot: pro]

Yet when I run gemini-1.5-flash, with the exact same prompt and exact same code, I randomly get the recitation error (which I am currently not handling in a catch, just to see it easily in the console).

[screenshot: flash]

Thing is, it's not even particularly faster than 1.0 Pro, and 1.5 Pro also does not give this error.

I am running the free version, so I'm comparing 1.0 Pro and 1.5 Flash mostly, as they both have similar request limits.

I did not change any settings (no safety settings or response type or anything), so everything should be the default.

jouwana avatar Jun 13 '24 03:06 jouwana

(quoting @jouwana's full comment above)

By any chance, have you tried using tools? And have you experimented with various models? Also, have you compared sendMessage and sendMessageStream?

HarshavardhanNetha avatar Jun 13 '24 03:06 HarshavardhanNetha

By any chance, have you tried using tools? And have you experimented with various models? Also, have you compared sendMessage and sendMessageStream?

I am unsure what 'tools' refers to, so I probably haven't tried it.

I have tried different things with the different models: prompt form, length, sending the same one repeatedly vs. sending different ones in a row. Only Flash has the problem, and it pops up no matter what prompts I'm sending, usually after 2-3 sends; sometimes it can last 4-5 without an error, but that's less common.

As for sendMessage and sendMessageStream, I am mostly sending unrelated prompts, which is why I preferred generateContent over startChat and sendMessage, so I haven't tried them for now.

jouwana avatar Jun 13 '24 03:06 jouwana

@hsubox76 just wanted to page the contributors here: this is a MAJOR bug, especially when you are sunsetting 1.0 Pro and moving everyone to 1.5 Flash. It is literally unusable right now with this recitation error. This is the only reason we cannot migrate over from other foundation models; you are losing out on customers.

Is anyone on the team looking at this?

chanmathew avatar Jun 13 '24 05:06 chanmathew

Same problem here using Gemini 1.5 Pro

httplups avatar Jun 17 '24 15:06 httplups

I have this problem using gemini-1.5-flash. If I send the same prompt a second time, it shows me this error: [GoogleGenerativeAI Error]: Candidate was blocked due to SAFETY

Prompt example: Write a romantic sentence that includes the number '5' written exactly as '5'. Make sure the number appears in numerical format and not in words.

If you send the same thing but change only the number, it generates the error. For example: Write a romantic sentence that includes the number '73' written exactly as '73'. Make sure the number appears in numerical format and not in words.

holaggabriel avatar Jun 17 '24 20:06 holaggabriel

Same here; looks like this issue is stuck under the backlog pile.

rartin avatar Jun 18 '24 16:06 rartin

This does seem like a serious issue - unfortunately it's beyond our ability to fix in the SDK. We're asking anyone to bring issues with the service or the models to the discussion forum: https://discuss.ai.google.dev/ where they will be more likely to reach those who work on the models themselves.

If posting in the forum, feel free to link back to this issue as it contains a lot of info and examples of the problem which might be helpful background, and if someone opens a thread in the discussion forum, please link it here so that all the JS SDK users facing this issue can add to that thread and hopefully get it looked at.

We will also try to get some answers internally but it's probably best for us to use all channels, including the discussion forum.

hsubox76 avatar Jun 18 '24 17:06 hsubox76

Hey folks, following up, we are investigating this and will share more details as soon as we have an update on our end!

logankilpatrick avatar Jun 18 '24 18:06 logankilpatrick

Hey folks, following up, we are investigating this and will share more details as soon as we have an update on our end!

Hi Logan, any updates on this?

matadornetwork avatar Jun 19 '24 21:06 matadornetwork

FYI - here is a recitation failure I find reproducible and strange. The behavior/error differences between Node.js and the console may be telling; perhaps the listing/encoding causes the sensitivity, but no change to the parameters can prevent the error:

NO ERROR - CHESS OPENINGS - instructions prompt - System: You are a service that returns tidy, single line list of domain items responses to prompts. Your response is only a comma seperated lists of items like 'item1, item2, item3..'. List as many items as performantly possible to enrich the data. If prompted for a list of fruits, you respond with 'apple, pear, banana, ..'. Do not include ‘..’ in the response, instead add as many items (10-50) as needed to fulfil the request in a timely manner. \n\nHuman: Given these instructions, the prompt asks for a comma separated list of chess openings, return your tidy response of items. \n\nAssistant: Here is the requested list of comma seperated domain items;

RECITATION ERROR - POKEMON - instructions prompt - System: You are a service that returns tidy, single line list of domain items responses to prompts. Your response is only a comma seperated lists of items like 'item1, item2, item3..'. List as many items as performantly possible to enrich the data. If prompted for a list of fruits, you respond with 'apple, pear, banana, ..'. Do not include ‘..’ in the response, instead add as many items (10-50) as needed to fulfil the request in a timely manner. \n\nHuman: Given these instructions, the prompt asks for a comma separated list of pokemon, return your tidy response of items. \n\nAssistant: Here is the requested list of comma seperated domain items;

pinballsurgeon avatar Jun 21 '24 12:06 pinballsurgeon

I am facing the same issue. Following this thread.

saranggupta94 avatar Jun 22 '24 06:06 saranggupta94

Facing the same issue with 1.5 Pro.

shashank734 avatar Jun 22 '24 13:06 shashank734

Facing the same issue for a RAG use case using gemini-1.5-pro. It's very unpredictable, as the error depends on the output and not always on the context/prompt.

samsondb avatar Jun 23 '24 20:06 samsondb

Facing the same problem.

chreds avatar Jun 25 '24 13:06 chreds