Bug Report: Repetitive Token Generation in "gemini-1.5-flash" Model
Description of the Bug: When generating long texts with the "gemini-1.5-flash" model, repetitive token sequences frequently occur, producing an effectively infinite loop that runs until the token limit is exhausted. The behavior is consistent across both the Vertex AI and Gemini APIs.
Example:
"The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be
Steps to Reproduce:
- Use the "gemini-1.5-flash" model via the Vertex AI or Gemini API.
- Generate a long text (e.g., a legal or technical document).
- Observe the generated output for repeated phrases or sentences (a minimal reproduction sketch follows below).
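A minimal reproduction sketch, assuming the google-generativeai Python SDK; the prompt and API key are placeholders:

```python
import google.generativeai as genai

# Reproduction sketch: request a long document and inspect the output for
# repeated sentences. The API key and prompt below are placeholders.
genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

response = model.generate_content(
    "Draft a detailed legal memorandum on appealing a civil judgment.",
    generation_config=genai.GenerationConfig(max_output_tokens=8192),
)
print(response.text)  # look for repeated phrases or sentences
```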
Expected Behavior: The model should produce coherent, non-repetitive text.
Actual Behavior: The model enters a repetitive loop, generating the same token sequences indefinitely until the token limit is reached.
Impact:
Resource Waste: Tokens are wasted, increasing costs and exhausting API usage limits.
Output Quality: The generated text becomes unusable, requiring additional API requests.
Reproduction Rate: Occurs frequently when generating long-form text.
Workaround: There is currently no known workaround to prevent this issue.
Request for Resolution:
- Investigate and resolve the root cause of repetitive token generation.
- Implement a mechanism to detect and avoid repetitive loops during generation (a client-side sketch of this idea follows below).
- Consider offering refunds or credits for tokens wasted due to this bug.
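Until something like this exists server-side, a rough client-side guard is possible: stream the response and stop consuming once the same sentence repeats. A minimal sketch, assuming the google-generativeai Python SDK; the REPEAT_LIMIT threshold and the naive sentence splitting are illustrative, not part of the SDK:

```python
import google.generativeai as genai

# Client-side loop guard (illustrative): abandon the stream once the same
# sentence has appeared REPEAT_LIMIT times in a row.
REPEAT_LIMIT = 3

genai.configure(api_key="YOUR_API_KEY")  # placeholder key
model = genai.GenerativeModel("gemini-1.5-flash")

def generate_with_loop_guard(prompt: str) -> str:
    buffer = ""
    for chunk in model.generate_content(prompt, stream=True):
        buffer += chunk.text  # note: .text can raise for non-text chunks
        # Naive sentence split on periods; good enough to spot exact repeats.
        sentences = [s.strip() for s in buffer.split(".") if s.strip()]
        tail = sentences[-REPEAT_LIMIT:]
        if len(tail) == REPEAT_LIMIT and len(set(tail)) == 1:
            # Stop reading; tokens already generated may still be billed.
            break
    return buffer
```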
Actual vs. Expected Behavior:
Actual Output:
"The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed..."
Expected Output:
"The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly."
Hi @Razaghallu786,
Could you please clarify this a bit? Does it happen with features like function calling or structured output, or simply when running the prompt above?
Which temperature are you using? If you are using 0, could you try a higher one?
If there is no update, please close this issue, @Razaghallu786.
Here are a few ideas I've been exploring to tackle the repetitive token generation issue:
Tuning Generation Parameters: You might try adjusting the generation parameters, specifically the temperature, top_p, and top_k values; this can sometimes reduce unwanted repetition.
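For instance, a minimal sketch using the google-generativeai Python SDK; the specific values are starting points to experiment with, not recommendations:

```python
import google.generativeai as genai

# Sketch: experimenting with sampling parameters. API key, prompt, and
# parameter values are placeholders.
genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

response = model.generate_content(
    "Draft a detailed legal memorandum on appealing a civil judgment.",
    generation_config=genai.GenerationConfig(
        temperature=0.9,  # higher values diversify sampling
        top_p=0.95,       # nucleus sampling cutoff
        top_k=40,         # restrict sampling to the 40 most likely tokens
    ),
)
print(response.text)
```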
Simplify Prompts / Advanced Prompting: Often, the problem occurs when the model struggles with complex prompts. Simplifying the prompt can make a difference. Alternatively, you can experiment with advanced prompting methods like chain-of-thought (CoT) or even few-shot prompting to guide the model more effectively.
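As an illustration of the few-shot idea, one could prepend a completed example so the model sees the desired shape of a concise, non-repetitive answer; the Q/A pair below is invented:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key
model = genai.GenerativeModel("gemini-1.5-flash")

# Invented one-shot example demonstrating a terse, non-repetitive answer
# before posing the real question.
few_shot_prompt = (
    "Q: Summarize the appeal options after a civil judgment in two sentences.\n"
    "A: The losing party may file a motion for reconsideration arguing that "
    "the judge misweighed the evidence. If that fails, an appeal to a higher "
    "court is the next step.\n\n"
    "Q: Summarize the enforcement options after a final judgment in two sentences.\n"
    "A:"
)
print(model.generate_content(few_shot_prompt).text)
```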
Model Variant Upgrade: If you're using a Gemini model, consider testing a newer variant (for example, switching from gemini-1.5-flash to gemini-2.0-flash) to see whether that improves the output.
Fine-Tuning Considerations: If you're using a fine-tuned model, try adjusting how you're calling the model to see if that impacts the repetition. Also, double-check that the dataset used for fine-tuning is suitable and that the fine-tuning process didn't introduce issues.
Framework Issues: If you access the model through a framework, try calling the API directly to determine whether the problem lies in the framework's implementation.
If you still encounter issues after trying these approaches, please share more details so I can investigate further.