IBM Granite prompt result changes after using the Logits probability button
### Describe the bug
When using the IBM Granite 4 GGUF model, pressing the Logits probability button once corrupts all subsequent generations from the model until the prompt is changed.
This likely applies to any model using Mamba-2 layers, though I only verified that other (non-Mamba) models do not exhibit the problem.
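My guess at the mechanism (a toy sketch, not the actual webui or llama.cpp code): the Logits pass advances the model's recurrent state, and the prompt-prefix cache reuse that is safe for a transformer KV cache is not safe for a Mamba-2 recurrent state, so the next generation starts from a stale state. All names below are hypothetical:

```python
class ToyRecurrentModel:
    """Minimal stand-in for a Mamba-style model with a persistent state."""

    def __init__(self):
        self.state = 0   # stands in for the recurrent (SSM) state
        self.n_past = 0  # how many prompt tokens are considered "cached"

    def eval(self, tokens):
        # Each token mutates the persistent state, as in a recurrent layer.
        for t in tokens:
            self.state += t
            self.n_past += 1
        return self.state * 2  # "logits" derived from the state

    def generate(self, prompt):
        # Prefix-cache reuse: skip tokens already evaluated. This is valid
        # for a transformer KV cache, but NOT for a recurrent state that a
        # separate logits pass has already advanced past the prompt.
        return self.eval(prompt[self.n_past:])


model = ToyRecurrentModel()
first = model.generate([1, 2, 3])   # fresh state: correct result
model.eval([1, 2, 3])               # "Logits probability" pass: state left dirty
second = model.generate([1, 2, 3])  # reuses stale state: different result
print(first, second)                # the two "deterministic" results differ
```

In this toy version, resetting `state` and `n_past` before (or after) the standalone logits pass makes the two generations identical again, which matches the observation that changing the prompt clears the problem.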
### Is there an existing issue for this?
- [x] I have searched the existing issues
### Reproduction
1. Load the Granite4-small base model in GGUF form.
2. Enter any prompt longer than two tokens into the Notebook section.
3. Press the Logits probability button twice, and observe that the two results differ.

Alternatively:
1. Generate some text from a prompt with Deterministic sampling.
2. Press the Logits probability button once.
3. Press Generate again and note that the text is different (and significantly worse).
### Screenshot
No response
### Logs
N/A
### System Info
Nvidia GPU, Windows 11