text-generation-webui
text generation webui llama generating random nonsense
Describe the bug
When running LLaMA 7B 4-bit with groupsize 32 on text-generation-webui, I get completely nonsensical responses. For example:

This is a conversation with your Assistant. The Assistant is very helpful and is eager to chat with you and answer your questions.
You: hi
Assistant: ekdia →dra defectRT”ÄRTRTRTRTRTRTRTRTRTRTRT
You: why are you typing up random stuff
Assistant: ädâDOCC
Assistant:
Is there an existing issue for this?
- [X] I have searched the existing issues
Reproduction
Run LLaMA 7B in any way on text-generation-webui and try chatting with it.
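A minimal command that should reproduce it, trimmed from the full invocation in the Logs section below (flags taken from that log; adjust the model folder name to match your setup):

```
# Trimmed repro sketch; assumes the same models/llama_7b folder as the log below.
python server.py --chat --wbits 4 --groupsize 32 --model llama_7b
```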
Screenshot
No response
Logs
```
(/home/floppa/oobagooba/installer_files/env) floppa@flop-PC:~/oobagooba/text-generation-webui$ python server.py --cai-chat --verbose --cpu-memory 4GB --wbits 4 --groupsize 32 --auto-device --gpu-memory 16 --listen --listen-port 7861 --extensions llama_prompts api long_term_memory --model llama_7b
Gradio HTTP request redirected to localhost :)
Warning: --cai-chat is deprecated. Use --chat instead.
bin /home/floppa/oobagooba/installer_files/env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so
Loading llama_7b...
Found the following quantized model: models/llama_7b/llama-7b-4bit-32g.safetensors
Loading model ...
Done.
Using the following device map for the quantized model: {'': 0}
Loaded the model in 2.82 seconds.
Loading the extension "api"... Ok.
Loading the extension "gallery"... Starting KoboldAI compatible api at http://0.0.0.0:5000/api
Ok.
Running on local URL: http://0.0.0.0:7861
To create a public link, set `share=True` in `launch()`.
This is a conversation with your Assistant. The Assistant is very helpful and is eager to chat with you and answer your questions.
You: hi
Assistant:
--------------------
Output generated in 13.18 seconds (15.09 tokens/s, 199 tokens, context 36, seed 1654316909)
This is a conversation with your Assistant. The Assistant is very helpful and is eager to chat with you and answer your questions.
You: hi
Assistant: ekdia →dra defectRT”ÄRTRTRTRTRTRTRTRTRTRTRT
You: why are you typing up random stuff
Assistant:
--------------------
Output generated in 12.85 seconds (15.48 tokens/s, 199 tokens, context 70, seed 441487768)
This is a conversation with your Assistant. The Assistant is very helpful and is eager to chat with you and answer your questions.
You: hi
Assistant: ekdia →dra defectRT”ÄRTRTRTRTRTRTRTRTRTRTRT
You: why are you typing up random stuff
Assistant: ädâDOCCÂ
Assistant:
--------------------
Output generated in 12.94 seconds (15.38 tokens/s, 199 tokens, context 80, seed 69881342)
^CTraceback (most recent call last):
  File "/home/floppa/oobagooba/text-generation-webui/server.py", line 923, in <module>
    time.sleep(0.5)
KeyboardInterrupt
```
System Info
Windows 11 with WSL Ubuntu
NVIDIA RTX 3090
Intel Core i9-12900K
Same here.
I had a similar problem: make sure you are running the correct model. It was something like CUDA vs. Triton. Use CUDA and delete the other.
Delete what, and where?
> I had a similar problem: make sure you are running the correct model. It was something like CUDA vs. Triton. Use CUDA and delete the other.
I'm running the LLaMA 7B models from the Hugging Face link on the doc page about using LLaMA with this webui. I tried both of them and got the same result.
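For anyone else wondering what to delete: "CUDA vs. Triton" most likely refers to the two branches of the GPTQ-for-LLaMa repo that the webui clones into repositories/. A sketch of switching to the CUDA branch, assuming the default layout and the qwopqwop200 repo (both are my reading of the comment, not something stated in this thread):

```
# Sketch, assuming the default repositories/ layout of text-generation-webui.
cd text-generation-webui/repositories
rm -rf GPTQ-for-LLaMa                        # remove the mismatched (e.g. Triton) copy
git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa -b cuda
cd GPTQ-for-LLaMa
python setup_cuda.py install                 # build the CUDA kernel extension
```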
Same issue, so this means the problem isn't 4 days old, it's five days old: https://github.com/oobabooga/text-generation-webui/issues/1554. This may relate to more than just AMD/Windows builds; however, they all have Windows in common, and all NVIDIA cards.
Torch 2.0.0 issue with CUDA?
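One quick way to test that theory from inside the webui's env (a sketch; it just prints the torch build, the CUDA version it was compiled against, and whether the GPU is visible):

```
# Compare the torch-side and driver-side CUDA versions.
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
nvidia-smi | head -n 3   # driver-reported CUDA version for comparison
```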
Same issue with Neko-Institute-of-Science_LLaMA-13B-4bit-128g on Ubuntu and NVIDIA. Regardless of the settings, the model produces repetitive random noise.
https://github.com/jllllll/one-click-installers
I made a pull request and updated everything for the oobabooga web UI; alternatively, download and replace the files from https://github.com/oobabooga/text-generation-webui/commits/main
Solved with the updated installer and CUDA installed: 11.8 [Tesla M40] and 12.1 [modern cards].
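If you would rather update an existing checkout than rerun the installer, a minimal sketch (assuming a git clone and the webui's env already activated; the installer may do more than this):

```
# Update the webui in place; a sketch, not the installer's exact steps.
cd text-generation-webui
git pull
pip install -r requirements.txt --upgrade
```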
I've spent an entire night on it now. Initially the one-click installer fixed it for me, but now I am lost...
> I've spent an entire night on it now. Initially the one-click installer fixed it for me, but now I am lost...
I have not used the webui for two weeks because of this error and moved to llama.cpp, which works great. I think the problem with Python projects, including this repo, is that Python developers cannot control the quality of the program. It really needs test-driven development.
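For reference, the kind of llama.cpp workflow being described; a sketch based on llama.cpp's README from that period (the main binary and the ggml q4_0 filename are assumptions and may differ in newer versions):

```
# Build llama.cpp and chat with a 4-bit 7B model (assumed file names).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && make
./main -m ./models/7B/ggml-model-q4_0.bin -n 128 -p "You: hi\nAssistant:"
```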
This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.