text-generation-webui
Bitsandbytes on Windows
Fixes issue #1893 (that I know of).
As the title says, this makes bitsandbytes work on Windows, based on this implementation in sd_dreambooth_extension.
It is implemented through monkeypatching, with a setup function that is called when server.py starts. The function returns immediately if not on Windows, since Linux is natively supported by bitsandbytes and needs no patching.
I made sure to monkeypatch everything that could be needed: mainly the replaced functions and class, but also some things from bitsandbytes.autograd._functions into the bitsandbytes namespace, since its __init__.py is missing them.
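A minimal sketch of that setup pattern (the function name and the commented-out patch body are hypothetical stand-ins, not the actual code in this PR):

```python
import platform

def setup_bitsandbytes_patches(system=None):
    """Apply Windows-only bitsandbytes patches; returns True if patching ran."""
    system = system or platform.system()
    if system != "Windows":
        # Linux is supported natively by bitsandbytes, so nothing to patch.
        return False
    # Hypothetical patch step: replace broken functions/classes and re-export
    # symbols from bitsandbytes.autograd._functions that __init__.py misses:
    # import bitsandbytes as bnb
    # bnb.matmul = bnb.autograd._functions.matmul  # illustrative only
    return True
```

Calling this once at server startup keeps the patching entirely out of the Linux code path.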
This PR also contains a fix for --trust-remote-code not working in 8-bit mode and some other situations (327588d, 70ac3f5, 173c9c6), which I forgot in my previous PR. I noticed it because the bigcode/santacoder model wouldn't load in 8-bit: trust_remote_code was hard-coded to False instead of being set to the trust_remote_code variable.
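The bug class is easy to illustrate with hypothetical stand-in functions (not the actual models.py code):

```python
def load_model_buggy(path, trust_remote_code):
    # Bug: the flag is hard-coded to False, so --trust-remote-code
    # is silently ignored in this code path.
    return {"path": path, "trust_remote_code": False}

def load_model_fixed(path, trust_remote_code):
    # Fix: forward the user's --trust-remote-code setting instead.
    return {"path": path, "trust_remote_code": trust_remote_code}
```

With the buggy version, a model whose repo requires custom code (like bigcode/santacoder) refuses to load no matter what flag the user passes.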
requirements.txt pins bitsandbytes==0.37.2 on Windows, as newer versions seemed to bug out; a fix I found online was to use this version.
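A per-platform pin like this can be expressed with a PEP 508 environment marker in requirements.txt; a sketch of what such lines could look like (the exact lines in the PR may differ):

```
bitsandbytes==0.37.2; platform_system == "Windows"
bitsandbytes; platform_system != "Windows"
```

pip evaluates the marker at install time, so Linux users keep getting the unpinned package.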
It also uses the right paths, as can be seen in 2cb91a1.
My tests
I have tested it with CUDA (torch-2.0.0+cu117) and without CUDA (plain torch), in a venv environment. It will also work in conda environments, since no venv-specific features are used.
I have also noticed a massive speed increase when running 6B-parameter models, almost 2x, because they can now fully load into my 12 GB of VRAM. I noticed a speed decrease on some models, but not on CPU. (Does bitsandbytes even do anything on CPU?)
TLDR
- Adds bitsandbytes support on Windows, with no effect on other OSes.
- Fixes the missing trust_remote_code in models.py.
- Bitsandbytes supports both CPU and CUDA and doesn't crash on either (since it picks the .dll accordingly).
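That .dll selection could be sketched roughly like this (the file names are illustrative; actual bitsandbytes Windows builds ship similarly named CPU and CUDA libraries):

```python
def pick_bitsandbytes_dll(cuda_available):
    # Choose the CUDA build only when a GPU is actually usable,
    # otherwise fall back to the CPU-only library so nothing crashes.
    if cuda_available:
        return "libbitsandbytes_cuda117.dll"
    return "libbitsandbytes_cpu.dll"
```

The point is simply that the library name is decided at runtime from CUDA availability rather than hard-coded, which is why plain-torch installs don't crash.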
Basically just that. It took me 6 hours, which is longer than it should have, but it was definitely worth it: going from 2.5 t/s to 4.1 t/s on 6B-parameter models is quite an improvement.
Oh, and it also allowed me to do LoRA training on GPU for a 6B-parameter GPT-J model on a 12 GB VRAM GPU.
The one click installer already includes a wheel for bitsandbytes on windows:
python -m pip install https://github.com/jllllll/bitsandbytes-windows-webui/raw/main/bitsandbytes-0.38.1-py3-none-any.whl
I don't think it's good practice to include a third-party library as part of the repository
I made this because the one-click installer wasn't working for me: conda wasn't recognized as a command, so I had to install manually in a venv, which is how I've been running it.