
RECITATION finishReason Causing Content Generation Stops in Google Models

Open gbaptista opened this issue 1 year ago • 3 comments

Some Google models stop generating content due to finishReason = RECITATION.

According to the docs:

RECITATION: The token generation was stopped as the response was flagged for unauthorized citations.

gbaptista avatar Jun 23 '24 11:06 gbaptista

How to easily reproduce it, using this prompt:

Give the first page of the first chapter of Harry Potter.

{
  "candidates":[
    {
      "finishReason":"RECITATION",
      "safetyRatings":[
        {
          "category":"HARM_CATEGORY_HATE_SPEECH",
          "probability":"NEGLIGIBLE",
          "probabilityScore":0.31806138,
          "severity":"HARM_SEVERITY_NEGLIGIBLE",
          "severityScore":0.13039611
        },
        {
          "category":"HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability":"NEGLIGIBLE",
          "probabilityScore":0.13764834,
          "severity":"HARM_SEVERITY_NEGLIGIBLE",
          "severityScore":0.0248928
        },
        {
          "category":"HARM_CATEGORY_HARASSMENT",
          "probability":"NEGLIGIBLE",
          "probabilityScore":0.44049937,
          "severity":"HARM_SEVERITY_NEGLIGIBLE",
          "severityScore":0.17050801
        },
        {
          "category":"HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability":"NEGLIGIBLE",
          "probabilityScore":0.24653332,
          "severity":"HARM_SEVERITY_LOW",
          "severityScore":0.20914645
        }
      ],
      "citationMetadata":{
        "citations":[
          {
            "startIndex":268,
            "endIndex":417,
            "uri":"https://www.lisarivero.com/2011/06/24/plain-and-fancy-words/"
          },
          {
            "startIndex":302,
            "endIndex":581,
            "uri":"https://thefriendlyeditor.com/2012/03/09/rowling-hook-page-one/"
          }
        ]
      }
    }
  ],
  "usageMetadata":{
    "promptTokenCount":12,
    "candidatesTokenCount":97,
    "totalTokenCount":109
  }
}

Of course, these are probably expected results, with Google trying to avoid generating copyrighted content. The issue is that there are too many false positives, which halt generation for many legitimate prompts.
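For anyone who wants to at least detect this case on the client side, here is a minimal sketch in Python. It assumes the response has already been parsed into a dict shaped like the JSON above; `recitation_candidates` is a hypothetical helper name, not part of gemini-ai or any Google SDK, and the sample is a trimmed copy of the response shown in this issue. It only reports which candidates were stopped and which URIs were cited; it does not work around the block.

```python
import json


def recitation_candidates(response: dict) -> list[dict]:
    """Return one entry per candidate whose finishReason is RECITATION.

    Hypothetical helper: it only inspects fields that appear in the
    response shown in this issue (finishReason, citationMetadata).
    """
    flagged = []
    for index, candidate in enumerate(response.get("candidates", [])):
        if candidate.get("finishReason") != "RECITATION":
            continue
        # citationMetadata may be absent, so fall back to empty values.
        citations = candidate.get("citationMetadata", {}).get("citations", [])
        flagged.append({
            "candidate": index,
            "uris": [citation.get("uri") for citation in citations],
        })
    return flagged


# Trimmed copy of the response above (safetyRatings omitted).
sample = json.loads("""
{
  "candidates": [
    {
      "finishReason": "RECITATION",
      "citationMetadata": {
        "citations": [
          {"startIndex": 268, "endIndex": 417,
           "uri": "https://www.lisarivero.com/2011/06/24/plain-and-fancy-words/"},
          {"startIndex": 302, "endIndex": 581,
           "uri": "https://thefriendlyeditor.com/2012/03/09/rowling-hook-page-one/"}
        ]
      }
    }
  ],
  "usageMetadata": {"promptTokenCount": 12, "candidatesTokenCount": 97, "totalTokenCount": 109}
}
""")

for entry in recitation_candidates(sample):
    print(f"candidate {entry['candidate']} stopped for recitation:")
    for uri in entry["uris"]:
        print(f"  cited {uri}")
```

A caller could use the returned list to decide whether to retry with a rephrased prompt or surface the cited URIs to the user; which of those makes sense depends on the application.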

gbaptista avatar Jun 23 '24 16:06 gbaptista

I have the same issue: I am trying to use Gemini for summarization. Naturally, a summary of copyrighted content gets flagged as "copyrighted content"; however, we have explicit permission to use it.

maayanorner avatar Jun 29 '24 18:06 maayanorner

I'm getting this error constantly with non-copyrighted material. I'm just trying to extract data/snippets from public law texts and attachments, and all of the requested output is present in the provided input.

naourass avatar Sep 29 '24 03:09 naourass