text-generation-webui
Monkey patch gives missing module error in docker version
Describe the bug
On a freshly rebuilt container, if I try loading with --monkey-patch, I get this:
No module named 'autograd_4bit'
Perhaps a missing requirement?
Is there an existing issue for this?
- [X] I have searched the existing issues
Reproduction
1. git pull
2. Append --monkey-patch to CLI_ARGS (see the sketch below)
3. docker-compose up --build
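For reference, CLI_ARGS is the variable the project's docker-compose setup passes to the server, typically set in a .env file (that location is an assumption here). A minimal sketch of the edited line; only --monkey-patch comes from the reproduction above, the other flags are illustrative and simply mirror the logs below:

```
# .env consumed by docker-compose; flags other than --monkey-patch are
# illustrative, chosen to mirror the logs below
CLI_ARGS=--monkey-patch --model mayaeary_pygmalion-6b_dev-4bit-128g --settings settings.json
```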
Logs
[+] Running 1/0
✔ Container text-generation-webui-text-generation-webui-1 Recreated 0.0s
Attaching to text-generation-webui-text-generation-webui-1
text-generation-webui-text-generation-webui-1 |
text-generation-webui-text-generation-webui-1 | ==========
text-generation-webui-text-generation-webui-1 | == CUDA ==
text-generation-webui-text-generation-webui-1 | ==========
text-generation-webui-text-generation-webui-1 |
text-generation-webui-text-generation-webui-1 | CUDA Version 11.8.0
text-generation-webui-text-generation-webui-1 |
text-generation-webui-text-generation-webui-1 | Container image Copyright (c) 2016-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
text-generation-webui-text-generation-webui-1 |
text-generation-webui-text-generation-webui-1 | This container image and its contents are governed by the NVIDIA Deep Learning Container License.
text-generation-webui-text-generation-webui-1 | By pulling and using the container, you accept the terms and conditions of this license:
text-generation-webui-text-generation-webui-1 | https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
text-generation-webui-text-generation-webui-1 |
text-generation-webui-text-generation-webui-1 | A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
text-generation-webui-text-generation-webui-1 |
text-generation-webui-text-generation-webui-1 | Gradio HTTP request redirected to localhost :)
text-generation-webui-text-generation-webui-1 | bin /app/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so
text-generation-webui-text-generation-webui-1 | Loading settings from settings.json...
text-generation-webui-text-generation-webui-1 | Loading mayaeary_pygmalion-6b_dev-4bit-128g...
text-generation-webui-text-generation-webui-1 | Warning: applying the monkey patch for using LoRAs in 4-bit mode.
text-generation-webui-text-generation-webui-1 | It may cause undefined behavior outside its intended scope.
text-generation-webui-text-generation-webui-1 | Traceback (most recent call last):
text-generation-webui-text-generation-webui-1 | File "/app/server.py", line 917, in <module>
text-generation-webui-text-generation-webui-1 | shared.model, shared.tokenizer = load_model(shared.model_name)
text-generation-webui-text-generation-webui-1 | File "/app/modules/models.py", line 118, in load_model
text-generation-webui-text-generation-webui-1 | from modules.monkey_patch_gptq_lora import load_model_llama
text-generation-webui-text-generation-webui-1 | File "/app/modules/monkey_patch_gptq_lora.py", line 8, in <module>
text-generation-webui-text-generation-webui-1 | import autograd_4bit
text-generation-webui-text-generation-webui-1 | ModuleNotFoundError: No module named 'autograd_4bit'
text-generation-webui-text-generation-webui-1 exited with code 1
System Info
Manjaro, docker
RTX 3080 10GB
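For what it's worth, the failing import can be checked in isolation before starting the full server. A hedged one-liner, assuming the compose service can run a one-off command; the service name and venv path are taken from the logs above:

```
# one-off container against the same image; the python path comes from the
# bitsandbytes line in the logs above
docker-compose run --rm text-generation-webui /app/venv/bin/python -c "import autograd_4bit"
```

If that prints the same ModuleNotFoundError, the image simply doesn't contain the alpaca_lora_4bit code the patch needs (the traceback later in this thread shows where it is expected to live).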
Have you followed the install steps for the monkeypatch listed on the wiki at https://github.com/oobabooga/text-generation-webui/wiki/GPTQ-models-(4-bit-mode)#using-loras-in-4-bit-mode ?
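For context, those wiki steps boil down to roughly the following. This is a paraphrased sketch rather than the wiki text: the repositories/ path is inferred from the traceback later in this thread, and the lora_4bit branch name is an assumption.

```
# clone the 4-bit LoRA patch code where the webui looks for it, then replace
# the GPTQ CUDA kernel with sterlind's fork, which provides the gptq_llama
# package; branch name is an assumption
cd text-generation-webui/repositories
git clone https://github.com/johnsmith0031/alpaca_lora_4bit
pip uninstall -y quant-cuda
pip install git+https://github.com/sterlind/GPTQ-for-LLaMa.git@lora_4bit
```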
I (probably wrongly) assumed that this would be covered by the Docker build, or at least that Oobabooga would want this to work out of the box in the Docker version.
I checked, and the Dockerfile does indeed install GPTQ-for-LLaMa, but from Oobabooga's own repository rather than from sterlind's.
The problem is that the monkeypatch breaks things - it would probably need a separate install option so you can choose which one you want (monkeypatch or normal).
At least until more things get merged upstream.
I see. Thank you!
Sorry for bringing this up again, but were you able to solve this? I am having the same issue with the latest pull, and I am running this on Ubuntu (WSL) directly, without Docker. It doesn't allow training without the --monkey-patch flag, and I get the above error when adding the flag.
@bgagandeep Have you followed the install steps for the monkeypatch listed on the wiki at https://github.com/oobabooga/text-generation-webui/wiki/GPTQ-models-(4-bit-mode)#using-loras-in-4-bit-mode ?
@bgagandeep No, honestly I gave up for now. I still consider it a bug, since the monkey patch is among the available options of the Web UI but doesn't work. I think the correct course of action would be to find a way to allow fine-tuning of 4-bit quantised models out of the box, but I'm not a developer myself so I just hope that someone working on this can achieve that soon.
@mcmonkey4eva I tried the direction you provided; I removed and redid the steps... but still no luck.
I also tried searching for autograd_4bit.py files in other repos and applying them, but that didn't seem to work... tried on Linux and WSL as well as Windows.
Following is the basic config of my system: GPU: 3080 Ti, CPU: i7-11700K, RAM: 32 GB
It seems as if those install steps are not meant to be used with Docker; they need to be included in the Dockerfile. I've created a Dockerfile for myself to make it work, and I've uploaded it here.
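Roughly, the extra build steps such a Dockerfile needs might look like this. This is a sketch under the same assumptions as above (venv path from the logs, repositories/ layout from the traceback, and the assumption that alpaca_lora_4bit ships a requirements.txt), not the uploaded file itself:

```
# hypothetical additions to the project's Dockerfile
RUN . /app/venv/bin/activate && \
    git clone https://github.com/johnsmith0031/alpaca_lora_4bit /app/repositories/alpaca_lora_4bit && \
    pip install -r /app/repositories/alpaca_lora_4bit/requirements.txt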
I'm encountering the same error. Following the steps in the provided link to install the monkey patch did in fact resolve the error related to autograd_4bit. However, after doing so, the gptq_llama module is no longer found:
Traceback (most recent call last):
  File "C:\text-generation-webui\server.py", line 68, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "C:\text-generation-webui\modules\models.py", line 95, in load_model
    output = load_func(model_name)
  File "C:\text-generation-webui\modules\models.py", line 267, in GPTQ_loader
    from modules.monkey_patch_gptq_lora import load_model_llama
  File "C:\text-generation-webui\modules\monkey_patch_gptq_lora.py", line 8, in <module>
    import autograd_4bit
  File "C:\text-generation-webui\repositories\alpaca_lora_4bit\autograd_4bit.py", line 1, in <module>
    import matmul_utils_4bit as mm4b
  File "C:\text-generation-webui\repositories\alpaca_lora_4bit\matmul_utils_4bit.py", line 3, in <module>
    from gptq_llama import quant_cuda
ModuleNotFoundError: No module named 'gptq_llama'
GPU: RTX 4090, CPU: i9-19000, RAM: 64 GB, OS: Windows
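The last import in that traceback is the tell: matmul_utils_4bit.py expects quant_cuda under the gptq_llama package name, which (per the fork mentioned earlier in this thread) only exists when the kernel is built from sterlind's GPTQ-for-LLaMa; the stock repo installs it as a top-level quant_cuda module instead. A quick hedged probe:

```
# if the first line fails, the fork's kernel isn't installed; the second
# line checks for the stock kernel instead
python -c "from gptq_llama import quant_cuda"
python -c "import quant_cuda"
```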
This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.