text-generation-webui
Error when starting the 4-bit model
Describe the bug
I found and downloaded the LLaMA model from https://huggingface.co/elinas/llama-30b-int4/tree/main. All of the files are in the folder text-generation-webui\models\llama30b. When I run it, it says the CUDA extension is not installed, even though I did install it, and it also can't find groupsize.
Is there an existing issue for this?
- [X] I have searched the existing issues
Reproduction
Screenshot
No response
Logs
C:\text-generation-webui>python server.py --gptq-bits 4 --gptq-model-type llama
Loading llama30b...
CUDA extension not installed.
Traceback (most recent call last):
File "C:\text-generation-webui\server.py", line 243, in <module>
shared.model, shared.tokenizer = load_model(shared.model_name)
File "C:\text-generation-webui\modules\models.py", line 101, in load_model
model = load_quantized(model_name)
File "C:\text-generation-webui\modules\GPTQ_loader.py", line 64, in load_quantized
model = load_quant(str(path_to_model), str(pt_path), shared.args.gptq_bits)
TypeError: load_quant() missing 1 required positional argument: 'groupsize'
System Info
64 GB RAM
Maybe this is the wrong version of LLaMA; if so, please point me to a working one.
Run git reset --hard 468c47c01b4fe370616747b6d69a2d3f48bab5e4 inside the GPTQ-for-LLaMa folder.
Don't forget to recompile the CUDA kernel:
python setup_cuda.py install
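For context on the TypeError: newer GPTQ-for-LLaMa revisions added a required groupsize parameter to load_quant(), while the webui code in the log above still calls it with three arguments, so resetting GPTQ-for-LLaMa to the older commit makes the two sides agree again. Here is a minimal compatibility sketch, my own illustration rather than webui code; it assumes load_quant is importable from GPTQ-for-LLaMa's llama.py and that -1 means "no grouping", per GPTQ-for-LLaMa's convention:

import inspect

from llama import load_quant  # assumption: GPTQ-for-LLaMa's llama.py is on sys.path

def load_quant_compat(model, checkpoint, wbits, groupsize=-1):
    # Dispatch on the installed revision's signature: older checkouts
    # (e.g. 468c47c) define load_quant(model, checkpoint, wbits), while
    # newer ones also require a groupsize argument.
    if 'groupsize' in inspect.signature(load_quant).parameters:
        return load_quant(model, checkpoint, wbits, groupsize)  # newer API
    return load_quant(model, checkpoint, wbits)  # older three-argument API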
Worked here. Thank you!
Hi! I'm getting a different error in this case:
Loading alpaca-13b-lora-int4...
Traceback (most recent call last):
File "/home/text-generation/server.py", line 275, in <module>
shared.model, shared.tokenizer = load_model(shared.model_name)
File "/home/text-generation/modules/models.py", line 101, in load_model
model = load_quantized(model_name)
File "/home/text-generation/modules/GPTQ_loader.py", line 78, in load_quantized
model = load_quant(str(path_to_model), str(pt_path), shared.args.wbits, shared.args.groupsize)
TypeError: load_quant() takes 3 positional arguments but 4 were given
I tried running git reset --hard 468c47c01b4fe370616747b6d69a2d3f48bab5e4 inside the GPTQ-for-LLaMa folder, but it had no effect.
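This is the same version mismatch as above, just in the opposite direction: after the git reset, load_quant() is the old three-argument version, but your newer GPTQ_loader.py passes groupsize as a fourth argument. A quick check of which signature is actually installed (illustrative only; it assumes the GPTQ-for-LLaMa folder is on sys.path so its llama module imports):

import inspect
import llama  # assumption: GPTQ-for-LLaMa's llama.py is importable

# Old revisions print (model, checkpoint, wbits);
# newer ones also list a groupsize parameter.
print(inspect.signature(llama.load_quant))

If groupsize is missing from the printed signature, your GPTQ-for-LLaMa checkout is older than what the webui expects, so updating it rather than resetting it is the direction to go, which matches the advice below.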
The webui currently uses the latest GPTQ-for-LLaMa, so don't use git reset --hard 468c47c01b4fe370616747b6d69a2d3f48bab5e4.
The model he is loading won't work with the current GPTQ, as far as I know.
Make sure to check this link for up-to-date information; I keep it updated:
https://github.com/oobabooga/text-generation-webui/wiki/LLaMA-model#4-bit-mode