text-generation-webui
Monkey patch gives missing module error in docker version
Describe the bug
On a freshly rebuilt container, if I try loading with --monkey-patch, I get this:
No module named 'autograd_4bit'
Perhaps a missing requirement?
Is there an existing issue for this?
- [X] I have searched the existing issues
Reproduction
1. git pull
2. Append --monkey-patch to CLI_ARGS (see the sketch below)
3. docker-compose up --build
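For reference, CLI_ARGS is the variable the project's docker-compose setup passes to the server, typically set in a .env file (that location is an assumption here). A minimal sketch of the edited line; only --monkey-patch comes from the reproduction above, the other flags are illustrative and simply mirror the logs below:

```
# .env consumed by docker-compose; flags other than --monkey-patch are
# illustrative, chosen to mirror the logs below
CLI_ARGS=--monkey-patch --model mayaeary_pygmalion-6b_dev-4bit-128g --settings settings.json
```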
Logs
[+] Running 1/0
✔ Container text-generation-webui-text-generation-webui-1 Recreated 0.0s
Attaching to text-generation-webui-text-generation-webui-1
text-generation-webui-text-generation-webui-1 |
text-generation-webui-text-generation-webui-1 | ==========
text-generation-webui-text-generation-webui-1 | == CUDA ==
text-generation-webui-text-generation-webui-1 | ==========
text-generation-webui-text-generation-webui-1 |
text-generation-webui-text-generation-webui-1 | CUDA Version 11.8.0
text-generation-webui-text-generation-webui-1 |
text-generation-webui-text-generation-webui-1 | Container image Copyright (c) 2016-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
text-generation-webui-text-generation-webui-1 |
text-generation-webui-text-generation-webui-1 | This container image and its contents are governed by the NVIDIA Deep Learning Container License.
text-generation-webui-text-generation-webui-1 | By pulling and using the container, you accept the terms and conditions of this license:
text-generation-webui-text-generation-webui-1 | https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
text-generation-webui-text-generation-webui-1 |
text-generation-webui-text-generation-webui-1 | A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
text-generation-webui-text-generation-webui-1 |
text-generation-webui-text-generation-webui-1 | Gradio HTTP request redirected to localhost :)
text-generation-webui-text-generation-webui-1 | bin /app/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so
text-generation-webui-text-generation-webui-1 | Loading settings from settings.json...
text-generation-webui-text-generation-webui-1 | Loading mayaeary_pygmalion-6b_dev-4bit-128g...
text-generation-webui-text-generation-webui-1 | Warning: applying the monkey patch for using LoRAs in 4-bit mode.
text-generation-webui-text-generation-webui-1 | It may cause undefined behavior outside its intended scope.
text-generation-webui-text-generation-webui-1 | Traceback (most recent call last):
text-generation-webui-text-generation-webui-1 | File "/app/server.py", line 917, in <module>
text-generation-webui-text-generation-webui-1 | shared.model, shared.tokenizer = load_model(shared.model_name)
text-generation-webui-text-generation-webui-1 | File "/app/modules/models.py", line 118, in load_model
text-generation-webui-text-generation-webui-1 | from modules.monkey_patch_gptq_lora import load_model_llama
text-generation-webui-text-generation-webui-1 | File "/app/modules/monkey_patch_gptq_lora.py", line 8, in <module>
text-generation-webui-text-generation-webui-1 | import autograd_4bit
text-generation-webui-text-generation-webui-1 | ModuleNotFoundError: No module named 'autograd_4bit'
text-generation-webui-text-generation-webui-1 exited with code 1
System Info
Manjaro, docker
RTX 3080 10GB
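For what it's worth, the failing import can be checked in isolation before starting the full server. A hedged one-liner, assuming the compose service can run a one-off command; the service name and venv path are taken from the logs above:

```
# one-off container against the same image; the python path comes from the
# bitsandbytes line in the logs above
docker-compose run --rm text-generation-webui /app/venv/bin/python -c "import autograd_4bit"
```

If that prints the same ModuleNotFoundError, the image simply doesn't contain the alpaca_lora_4bit code the patch needs (the traceback later in this thread shows where it is expected to live).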
Have you followed the install steps for the monkeypatch listed on the wiki at https://github.com/oobabooga/text-generation-webui/wiki/GPTQ-models-(4-bit-mode)#using-loras-in-4-bit-mode ?
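For context, those wiki steps boil down to roughly the following. This is a paraphrased sketch rather than the wiki text: the repositories/ path is inferred from the traceback later in this thread, and the lora_4bit branch name is an assumption.

```
# clone the 4-bit LoRA patch code where the webui looks for it, then replace
# the GPTQ CUDA kernel with sterlind's fork, which provides the gptq_llama
# package; branch name is an assumption
cd text-generation-webui/repositories
git clone https://github.com/johnsmith0031/alpaca_lora_4bit
pip uninstall -y quant-cuda
pip install git+https://github.com/sterlind/GPTQ-for-LLaMa.git@lora_4bit
```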
I (probably wrongly) assumed that this would be covered by the Docker build, or at least that Oobabooga would want this to work out of the box in the Docker version.
I checked, and the Dockerfile does indeed install GPTQ-for-LLaMa, but from Oobabooga's own repository rather than from sterlind's.
The problem is that the monkeypatch breaks things - it would probably need a separate install option so you can choose which one you want (monkeypatch or normal).
At least until more things get merged upstream.
I see. Thank you!
Sorry for bringing this up again, but were you able to solve this? I am having the same issue with the latest pull, and I am running this on Ubuntu (WSL) directly, without Docker. It doesn't allow training without the --monkey-patch flag, and I get the above error when adding the flag.
@bgagandeep Have you followed the install steps for the monkeypatch listed on the wiki at https://github.com/oobabooga/text-generation-webui/wiki/GPTQ-models-(4-bit-mode)#using-loras-in-4-bit-mode ?
@bgagandeep No, honestly I gave up for now. I still consider it a bug, since the monkey patch is among the available options of the Web UI but doesn't work. I think the correct course of action would be to find a way to allow fine-tuning of 4-bit quantised models out of the box, but I'm not a developer myself so I just hope that someone working on this can achieve that soon.
@mcmonkey4eva I tried the direction you provided; I removed and redid the steps... but still no luck.
I also tried searching for autograd_4bit.py files in other repos and applying them, but that didn't seem to work... tried on Linux and WSL as well as Windows.
Following is the basic config of my system: GPU: 3080 Ti, CPU: i7-11700K, RAM: 32 GB
It seems as if those install steps are not meant to be used with Docker; they need to be included in the Dockerfile. I've created a Dockerfile for myself to make it work, and I've uploaded it here.
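Roughly, the extra build steps such a Dockerfile needs might look like this. This is a sketch under the same assumptions as above (venv path from the logs, repositories/ layout from the traceback, and the assumption that alpaca_lora_4bit ships a requirements.txt), not the uploaded file itself:

```
# hypothetical additions to the project's Dockerfile
RUN . /app/venv/bin/activate && \
    git clone https://github.com/johnsmith0031/alpaca_lora_4bit /app/repositories/alpaca_lora_4bit && \
    pip install -r /app/repositories/alpaca_lora_4bit/requirements.txt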
I'm encountering the same error. Following the steps in the provided link to install the monkey patch did in fact resolve the error related to autograd_4bit. However, after doing so, the gptq_llama module is no longer found:
Traceback (most recent call last):
  File "C:\text-generation-webui\server.py", line 68, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "C:\text-generation-webui\modules\models.py", line 95, in load_model
    output = load_func(model_name)
  File "C:\text-generation-webui\modules\models.py", line 267, in GPTQ_loader
    from modules.monkey_patch_gptq_lora import load_model_llama
  File "C:\text-generation-webui\modules\monkey_patch_gptq_lora.py", line 8, in <module>
    import autograd_4bit
  File "C:\text-generation-webui\repositories\alpaca_lora_4bit\autograd_4bit.py", line 1, in <module>
    import matmul_utils_4bit as mm4b
  File "C:\text-generation-webui\repositories\alpaca_lora_4bit\matmul_utils_4bit.py", line 3, in <module>
    from gptq_llama import quant_cuda
ModuleNotFoundError: No module named 'gptq_llama'
GPU: RTX 4090, CPU: i9-19000, RAM: 64 GB, OS: Windows
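The last import in that traceback is the tell: matmul_utils_4bit.py expects quant_cuda under the gptq_llama package name, which (per the fork mentioned earlier in this thread) only exists when the kernel is built from sterlind's GPTQ-for-LLaMa; the stock repo installs it as a top-level quant_cuda module instead. A quick hedged probe:

```
# if the first line fails, the fork's kernel isn't installed; the second
# line checks for the stock kernel instead
python -c "from gptq_llama import quant_cuda"
python -c "import quant_cuda"
```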
This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.