text-generation-webui
Bitsandbytes on Windows
Fixes issue #1893 (that I know of).
As the title says, this makes bitsandbytes work on Windows, based on this implementation in sd_dreambooth_extension.
It is implemented through monkeypatching, with a setup function that is called when server.py starts. The function returns immediately if not on Windows, since Linux is natively supported by bitsandbytes and needs no patching.
I made sure to monkeypatch everything that could be needed: mainly the replaced functions and class, but also some things from bitsandbytes.autograd._functions into the bitsandbytes namespace, since its __init__.py is missing them.
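A minimal sketch of that setup pattern (the function name and the commented-out patch body are hypothetical stand-ins, not the actual code in this PR):

```python
import platform

def setup_bitsandbytes_patches(system=None):
    """Apply Windows-only bitsandbytes patches; returns True if patching ran."""
    system = system or platform.system()
    if system != "Windows":
        # Linux is supported natively by bitsandbytes, so nothing to patch.
        return False
    # Hypothetical patch step: replace broken functions/classes and re-export
    # symbols from bitsandbytes.autograd._functions that __init__.py misses:
    # import bitsandbytes as bnb
    # bnb.matmul = bnb.autograd._functions.matmul  # illustrative only
    return True
```

Calling this once at server startup keeps the patching entirely out of the Linux code path.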
This PR also contains a fix for --trust-remote-code not working in 8-bit mode and some other situations (327588d, 70ac3f5, 173c9c6), which I forgot in my previous PR. I noticed it because the bigcode/santacoder model wouldn't load in 8-bit: trust_remote_code was hard-coded to False instead of being set to the trust_remote_code variable.
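The bug class is easy to illustrate with hypothetical stand-in functions (not the actual models.py code):

```python
def load_model_buggy(path, trust_remote_code):
    # Bug: the flag is hard-coded to False, so --trust-remote-code
    # is silently ignored in this code path.
    return {"path": path, "trust_remote_code": False}

def load_model_fixed(path, trust_remote_code):
    # Fix: forward the user's --trust-remote-code setting instead.
    return {"path": path, "trust_remote_code": trust_remote_code}
```

With the buggy version, a model whose repo requires custom code (like bigcode/santacoder) refuses to load no matter what flag the user passes.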
requirements.txt pins bitsandbytes==0.37.2 on Windows, as newer versions seemed to bug out; a fix I found online was to use this version.
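A per-platform pin like this can be expressed with a PEP 508 environment marker in requirements.txt; a sketch of what such lines could look like (the exact lines in the PR may differ):

```
bitsandbytes==0.37.2; platform_system == "Windows"
bitsandbytes; platform_system != "Windows"
```

pip evaluates the marker at install time, so Linux users keep getting the unpinned package.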
It also uses the right paths, as can be seen in 2cb91a1.
My tests
I have tested it with CUDA (torch-2.0.0+cu117) and without CUDA (plain torch), in a venv environment. It will also work in conda environments, since no venv-specific features are used.
I have also noticed a massive speed increase when running 6B-parameter models, almost 2x, because they can now fully load into my 12 GB of VRAM. I noticed a speed decrease on some models, but not on CPU. (Does bitsandbytes even do anything on CPU?)
TLDR
- Adds bitsandbytes support on Windows, with no effect on other OSes.
- Fixes the missing trust_remote_code in models.py.
- Bitsandbytes supports both CPU and CUDA and doesn't crash on either (since it picks the .dll accordingly).
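That .dll selection could be sketched roughly like this (the file names are illustrative; actual bitsandbytes Windows builds ship similarly named CPU and CUDA libraries):

```python
def pick_bitsandbytes_dll(cuda_available):
    # Choose the CUDA build only when a GPU is actually usable,
    # otherwise fall back to the CPU-only library so nothing crashes.
    if cuda_available:
        return "libbitsandbytes_cuda117.dll"
    return "libbitsandbytes_cpu.dll"
```

The point is simply that the library name is decided at runtime from CUDA availability rather than hard-coded, which is why plain-torch installs don't crash.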
Basically just that. It took me 6 hours, which is longer than it should have, but it was definitely worth it: going from 2.5 t/s to 4.1 t/s on 6B-parameter models is quite an improvement.
Oh, and it also allowed me to do LoRA training on GPU for a 6B-parameter GPT-J model on a 12 GB VRAM GPU.
The one click installer already includes a wheel for bitsandbytes on windows:
python -m pip install https://github.com/jllllll/bitsandbytes-windows-webui/raw/main/bitsandbytes-0.38.1-py3-none-any.whl
I don't think it's good practice to include a third-party library as part of the repository
I made this because the one-click installer wasn't working for me: conda wasn't recognized as a command, so I had to install manually in a venv, which is how I've been running it.