IBM Granite prompt result changes after using the Logits probability button
### Describe the bug
When using the IBM Granite 4 GGUF model, pressing the Logits probability button once corrupts all subsequent generations from the model until the prompt is changed.
This likely applies to any model using Mamba-2 layers, though I only verified that other (non-Mamba) models do not exhibit the problem.
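My guess at the mechanism (a toy sketch, not the actual webui or llama.cpp code): the Logits pass advances the model's recurrent state, and the prompt-prefix cache reuse that is safe for a transformer KV cache is not safe for a Mamba-2 recurrent state, so the next generation starts from a stale state. All names below are hypothetical:

```python
class ToyRecurrentModel:
    """Minimal stand-in for a Mamba-style model with a persistent state."""

    def __init__(self):
        self.state = 0   # stands in for the recurrent (SSM) state
        self.n_past = 0  # how many prompt tokens are considered "cached"

    def eval(self, tokens):
        # Each token mutates the persistent state, as in a recurrent layer.
        for t in tokens:
            self.state += t
            self.n_past += 1
        return self.state * 2  # "logits" derived from the state

    def generate(self, prompt):
        # Prefix-cache reuse: skip tokens already evaluated. This is valid
        # for a transformer KV cache, but NOT for a recurrent state that a
        # separate logits pass has already advanced past the prompt.
        return self.eval(prompt[self.n_past:])


model = ToyRecurrentModel()
first = model.generate([1, 2, 3])   # fresh state: correct result
model.eval([1, 2, 3])               # "Logits probability" pass: state left dirty
second = model.generate([1, 2, 3])  # reuses stale state: different result
print(first, second)                # the two "deterministic" results differ
```

In this toy version, resetting `state` and `n_past` before (or after) the standalone logits pass makes the two generations identical again, which matches the observation that changing the prompt clears the problem.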
### Is there an existing issue for this?
- [x] I have searched the existing issues
### Reproduction
1. Load the Granite4-small base model in GGUF form.
2. Enter any prompt longer than two tokens into the Notebook section.
3. Press the Logits probability button twice, and observe that the two results differ.

Alternatively:
1. Generate some text from a prompt with Deterministic sampling.
2. Press the Logits probability button once.
3. Press Generate again and note that the text is different (and significantly worse).
### Screenshot
No response
### Logs
N/A
### System Info
Nvidia GPU, Windows 11