text-generation-webui
LLaMa preset
Description
I finally got LLaMa 13b to run on my GPU in 4bit mode. It's fun - but also absolutely weird and insanely unpredictable. xD My guess is that this is due to the parameters being used. Now, I am not an expert in this, but seeing as there are a couple of presets already shipped with the webui, it might be neat to have a preset tailored to LLaMa.
Additional Context
Does this need any further elaboration? ;)
Try using these settings from another web UI: n_predict=200, repeat_last_n=64, repeat_penalty=1.3, top_k=40, top_p=0.9, temp=0.8
PS: I would also highly recommend using Alpaca-LoRA for better output. Apart from adding instructions, it is also cleaned of miscellaneous trash and gives more stable results.
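For anyone wondering what top_k, top_p and temp in the settings above actually do, here is a rough sketch in plain Python. It is not how the webui or any particular backend implements sampling; the function name `filter_probs` and the order of the steps are my own assumptions, and real implementations differ in the details. It only shows the general idea: temperature rescales the logits, top-k keeps the k most likely tokens, and top-p trims that set to the smallest group whose probabilities add up to at least top_p.

```python
import math


def filter_probs(logits, top_k=40, top_p=0.9, temp=0.8):
    """Return {token_index: probability} after temperature, top-k and top-p.

    Illustrative only: `filter_probs` is a hypothetical helper, not a
    function from text-generation-webui or any other library.
    """
    # Temperature: divide logits before the softmax. Values below 1
    # sharpen the distribution, values above 1 flatten it.
    scaled = [x / temp for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep only the k most likely token indices.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability
    # reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalise the survivors so they sum to 1 again.
    kept_total = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_total for i in kept}


# Toy example with four made-up logits.
wide = filter_probs([2.0, 1.0, 0.5, -1.0], top_k=3, top_p=0.9, temp=0.8)
narrow = filter_probs([2.0, 1.0, 0.5, -1.0], top_k=3, top_p=0.5, temp=0.8)
```

With the toy logits, the first call keeps three candidates and the second keeps only the most likely one, which is why a lower top_p tends to give tamer output.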
That is a pretty reasonable text though lol XD
n_predict=200, repeat_last_n=64, repeat_penalty=1.3, top_k=40, top_p=0.9, temp=0.8
I put this into presets/LLaMa.txt (changed to newlines, of course) and applied it:
PS C:\tools\text-generation-webui-installer\text-generation-webui\presets> type .\LLaMa.txt
n_predict=200
repeat_last_n=64
repeat_penalty=1.3
top_k=40
top_p=0.9
temp=0.8
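One thing that may be worth checking: the names in this file (n_predict, repeat_last_n, temp) look like llama.cpp-style option names, since they came from another web UI. If the webui presets expect Hugging Face generate()-style names instead, which is an assumption on my part and not confirmed anywhere in this thread, these keys might not be picked up as intended. Here is a small sketch that renames the keys. The mapping table and the `translate_preset` helper are hypothetical, and I don't know of a direct generate() counterpart for repeat_last_n, so it is just reported as unmapped.

```python
# Hypothetical mapping from llama.cpp-style option names to
# Hugging Face generate()-style names. Whether the webui preset
# loader accepts every one of these keys is an assumption.
LLAMA_CPP_TO_HF = {
    "n_predict": "max_new_tokens",
    "repeat_penalty": "repetition_penalty",
    "temp": "temperature",
    "top_k": "top_k",
    "top_p": "top_p",
}


def translate_preset(text):
    """Parse key=value lines and rename the keys found in the mapping.

    Returns (translated, unmapped): a dict of renamed settings and a
    list of keys that have no entry in LLAMA_CPP_TO_HF.
    """
    translated, unmapped = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = (part.strip() for part in line.split("=", 1))
        if key in LLAMA_CPP_TO_HF:
            # Values with a decimal point become floats, the rest ints.
            number = float(value) if "." in value else int(value)
            translated[LLAMA_CPP_TO_HF[key]] = number
        else:
            unmapped.append(key)
    return translated, unmapped


PRESET = """\
n_predict=200
repeat_last_n=64
repeat_penalty=1.3
top_k=40
top_p=0.9
temp=0.8
"""

translated, unmapped = translate_preset(PRESET)
for name, value in translated.items():
    print(f"{name}={value}")
print("no mapping for:", unmapped)
```

Comparing the printed names against one of the presets that ships with the webui should show quickly whether the key names were the problem.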
And the launch script is this:
@echo off
@echo Starting the web UI...
set INSTALL_ENV_DIR=%cd%\installer_files\env
set PATH=%INSTALL_ENV_DIR%;%INSTALL_ENV_DIR%\Library\bin;%INSTALL_ENV_DIR%\Scripts;%INSTALL_ENV_DIR%\Library\usr\bin;%PATH%
call conda activate
cd text-generation-webui
call python server.py --auto-devices --load-in-4bit --model llama-13b-hf --listen-port 9999 --cai-chat
pause
And the replies are still quite wild. xD
Regenerate...
Regenerate...
This... did not work. x)
EDIT: I'll try adding a LoRA (new term, gonna learn about it first) and see if that changes anything.
Still in the same situation. However, after applying the latest updates to the ui itself and gptq, it is substantially faster now!
That said, it still refuses to write a Makefile...
This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.