text-generation-webui
llama.cpp sampling doesn't work in 1.15
Describe the bug
llama.cpp models always give exactly the same output (compared in WinMerge to be sure), as if they ignore all sampling options and the seed. Sometimes the first output after loading the model is slightly different, but every regenerate produces exactly the same output. I also tried with temperature=5. I can see the seed changing in the console. Even reloading the model or restarting the whole webui doesn't help.
This seems to happen only with the llama.cpp loader; I tried some exl2 models and they worked fine, with outputs differing between regenerations.
This doesn't seem model-specific, as I tried multiple GGUF models (which worked as expected in the past), such as Mistral Nemo, Mistral Small, and Qwen 2.5 32B. The same GGUFs worked before updating the webui to 1.15 (via update_wizard_windows.bat, as I usually do), so something in that update probably broke it.
This doesn't seem to be purely a UI issue, as I tried via SillyTavern over the API too and the results were the same.
I also tried installing a fresh copy of the webui (git clone and start_windows.bat), but the issue persists on the fresh install.
There was a similar issue in the past, #5451, but in that case changing top_k helped; in my case it didn't. Also, the llama-cpp-python versions mentioned in that issue are very old (as the issue itself is old). I don't know whether the source of this problem is in the webui or in llama-cpp-python.
Is there an existing issue for this?
- [X] I have searched the existing issues
Reproduction
- Load any GGUF model with llama.cpp loader
- Generate any response and note it down
- Regenerate multiple times with a high temperature
- Observe that the regenerated outputs are always identical
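The steps above can also be reproduced over the API, which is how I tested the SillyTavern case. Below is a minimal, hypothetical sketch of that check against the webui's OpenAI-compatible endpoint (the URL and port assume the default `--api` flag; the prompt and parameter values are illustrative, not the exact ones I used):

```python
# Hypothetical repro script: request two completions with a high temperature
# and a random seed, then check whether the outputs are identical.
# Assumes the webui was started with --api on the default port 5000.
import json
import urllib.request

API_URL = "http://127.0.0.1:5000/v1/completions"  # assumed default endpoint


def generate(prompt: str, temperature: float = 5.0, seed: int = -1) -> str:
    """Request one completion; seed=-1 asks for a random seed each time."""
    payload = json.dumps({
        "prompt": prompt,
        "max_tokens": 64,
        "temperature": temperature,
        "seed": seed,
    }).encode()
    req = urllib.request.Request(
        API_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]


def outputs_identical(a: str, b: str) -> bool:
    """True when two generations match exactly (the buggy behaviour)."""
    return a == b


if __name__ == "__main__":
    first = generate("Write one sentence about cats.")
    second = generate("Write one sentence about cats.")
    # With temperature=5 and random seeds, identical outputs suggest the
    # sampling parameters are being ignored.
    print("identical:", outputs_identical(first, second))
```

With a working loader (e.g. exl2) this prints `identical: False` almost every time; with the llama.cpp loader on 1.15 it prints `identical: True` on every run after the first.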
Screenshot
No response
Logs
Not sure what logs might be needed here
System Info
Windows 10
RTX 3090 - GPU driver 565.90
webui 1.15 - commit d1af7a41ade7bd3c3a463bfa640725edb818ebaf (newest on branch main)