text-generation-webui
warn("The installed version of bitsandbytes was compiled without GPU support. " - On Linux even tough I have a GPU
Describe the bug
I installed it on my Linux Mint, but when I ran:

conda activate textgen
cd text-generation-webui
python server.py

it gave me this:

UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
Running on local URL:
I can click the local URL and it opens in my browser, but when I select the Pygmalion model it gives me this error:
Traceback (most recent call last):
  File "/home/murillo/text-generation-webui/server.py", line 84, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "/home/murillo/text-generation-webui/modules/models.py", line 101, in load_model
    from modules.GPTQ_loader import load_quantized
  File "/home/murillo/text-generation-webui/modules/GPTQ_loader.py", line 14, in <module>
    import llama_inference_offload
ModuleNotFoundError: No module named 'llama_inference_offload'
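A quick way to sanity-check the warning: it only says that bitsandbytes itself was built without GPU support. A minimal sketch (using the same env name as above) to confirm PyTorch can actually see the card, which separates a CPU-only bitsandbytes wheel from a broken CUDA setup:

```sh
conda activate textgen
# "True" plus a CUDA version means PyTorch sees the GTX 1060, and the problem
# is only that the installed bitsandbytes was compiled without GPU support.
python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"
```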
Is there an existing issue for this?
- [X] I have searched the existing issues
Reproduction
run:
conda activate textgen
cd text-generation-webui
python server.py
and select the pygmalion model
Screenshot
No response
Logs
Traceback (most recent call last):
  File "/home/murillo/text-generation-webui/server.py", line 84, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "/home/murillo/text-generation-webui/modules/models.py", line 101, in load_model
    from modules.GPTQ_loader import load_quantized
  File "/home/murillo/text-generation-webui/modules/GPTQ_loader.py", line 14, in <module>
    import llama_inference_offload
ModuleNotFoundError: No module named 'llama_inference_offload'
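For context, `llama_inference_offload` is not a pip package: it is a file from a GPTQ-for-LLaMa checkout that the webui expects to find under `repositories/`. A rough sketch of the usual setup at the time (the exact fork/branch the webui expected changed between versions, so treat this as an assumption and follow the project's GPTQ instructions):

```sh
# Run from the text-generation-webui root; the required fork/branch varied
# between webui versions, so check the current docs before relying on this.
mkdir -p repositories
cd repositories
git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa.git
# llama_inference_offload.py sits at the top of that checkout, which is the
# module that modules/GPTQ_loader.py tries to import.
```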
System Info
Distro: Linux Mint 21.1 Vera
base: Ubuntu 22.04 jammy
CPU:
Info: quad core model: Intel Core i5-7400 bits: 64 type: MCP
Graphics:
Device-1: NVIDIA GP106 [GeForce GTX 1060 6GB]
driver: nvidia v: 525.105.17
Related to > https://github.com/oobabooga/text-generation-webui/issues/1164
Only thing worked was:
pip install -i https://test.pypi.org/simple/ bitsandbytes-cuda113
Related to > #1164
I already took a look at it and it didn't fix it.
Only thing worked was:
pip install -i https://test.pypi.org/simple/ bitsandbytes-cuda113
got this error now:
conda activate textgen
cd text-generation-webui
python server.py
Device with CUDA capability of 6 not supported for 8-bit matmul. Device has no tensor cores!
Traceback (most recent call last):
File "/home/murillo/text-generation-webui/server.py", line 27, in
What's your video-card?
nvidia 1060 6gb
Pretty sure that's not enough to run most models. I could be wrong.
I also have this problem, but I have installed the ROCm version.
Solved: remove the installed bitsandbytes (if any), then build/install the ROCm fork.
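In case it saves someone a search, here is a rough sketch of what "remove, then build/install the ROCm fork" can look like; the fork URL and make target below are placeholders rather than exact commands from this thread, so follow the fork's own README:

```sh
# Make sure no pre-built (CPU/CUDA) bitsandbytes wheel shadows the fork
pip uninstall -y bitsandbytes

# Build and install a ROCm fork of bitsandbytes from source.
# <rocm-bitsandbytes-fork-url> is a placeholder -- several community forks
# existed and their build targets differ (often something like "make hip").
git clone <rocm-bitsandbytes-fork-url> bitsandbytes-rocm
cd bitsandbytes-rocm
make hip        # check the fork's README for the actual target
pip install .
```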
pip install -i https://test.pypi.org/simple/ bitsandbytes-cuda113
Fixed for me on Linux, thank you.
Only thing worked was:
pip install -i https://test.pypi.org/simple/ bitsandbytes-cuda113
I did this on Windows without seeing it was for Linux. Getting an error now. Working on resolving it.
Hi,
I had this problem with the installation on Linux, for version 58c8ab4c7aaccc50f507fd08cce941976affe5e0 (git checkout 58c8ab4c7aaccc50f507fd08cce941976affe5e0), which is working for the model vicuna-13B-1.1-GPTQ-4bit-128g.
The solution was twofold:
- first, I recompiled the bitsandbytes lib for my exact CUDA version (12.1) and installed it.
- second, I installed the cuda-toolkit package from conda (conda install -c nvidia cuda-toolkit)
To recompile bitsandbytes:

cd repositories
git clone https://github.com/timdettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=121 make cuda12x
python setup.py install
Obviously you will need to use your own CUDA version (nvcc --version), the first 3 digits only. If your version starts with 11 -> make cuda11x.
Best regards,
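If it helps, that 3-digit value can be derived straight from nvcc instead of typing it by hand; this is just a convenience sketch (assumes GNU grep) around the commands above:

```sh
# e.g. "release 12.1" -> 121; then pick the matching target (cuda12x / cuda11x)
CUDA_VERSION=$(nvcc --version | grep -oP 'release \K[0-9]+\.[0-9]+' | tr -d '.')
echo "$CUDA_VERSION"
CUDA_VERSION=$CUDA_VERSION make cuda12x   # use cuda11x for 11.x toolkits
```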
Do you need to remove the prior bitsandbytes version before doing this recompile? Is there a command that does that, or do I have to just go in and delete all the relevant directories? I am using WSL Ubuntu.
No, you don't need to remove it manually; just install on top.
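If you want to double-check which copy "won" after installing on top, a quick sketch (nothing specific to this repo):

```sh
# Shows the version/location pip resolves, and the file Python actually imports
pip show bitsandbytes
python -c "import bitsandbytes, inspect; print(inspect.getfile(bitsandbytes))"
```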
Thanks, I'm trying this right now; it's compiling something alright. I don't want to use my CPU on my Linux PC for these tasks. I've only been using llama.cpp on CPU on my phone; it works great, but it's slow and crashes sometimes.
All of a sudden I'm getting this error again :/ and old fixes not working.
(textgen) perplexity@Perplexity:~/text-generation-webui$ python server.py --chat --listen --disk --gpu-memory 22 --groupsize 128 --model_type llama
INFO:Gradio HTTP request redirected to localhost :)
bin /home/perplexity/miniconda3/envs/textgen/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so
/home/perplexity/miniconda3/envs/textgen/lib/python3.10/site-packages/bitsandbytes/cextension.py:33: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
INFO:Loading the extension "gallery"...
Running on local URL: http://0.0.0.0:8989
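The `libbitsandbytes_cpu.so` line is the giveaway: bitsandbytes fell back to its CPU library. One way to see whether the installed package even ships a CUDA library it could have picked (a sketch, assuming the conda env shown in the log):

```sh
# If no libbitsandbytes_cuda*.so matching your CUDA runtime is listed here,
# you get the CPU fallback and the "compiled without GPU support" warning.
ls "$(python -c 'import bitsandbytes, os; print(os.path.dirname(bitsandbytes.__file__))')"/libbitsandbytes*
```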
Cheers @pi-infected, this drove me nuts; worked like a charm :)
Same problem on M1 Mac, anyone have a fix?
I ended up using exllama as an ML model execution engine, as I was unable to make it work with the latest updates of ooba. Exllama works right off the bat on my computer. I use it without a GUI, so it can be a solution if you don't need one and just use it from your own programs.
I'm getting this error now:
All of a sudden I'm getting this error again :/ and old fixes not working.
(textgen) perplexity@Perplexity:~/text-generation-webui$ python server.py --chat --listen --disk --gpu-memory 22 --groupsize 128 --model_type llama
INFO:Gradio HTTP request redirected to localhost :)
bin /home/perplexity/miniconda3/envs/textgen/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so
/home/perplexity/miniconda3/envs/textgen/lib/python3.10/site-packages/bitsandbytes/cextension.py:33: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
INFO:Loading the extension "gallery"...
Running on local URL: http://0.0.0.0:8989
https://github.com/TimDettmers/bitsandbytes/issues/112#issuecomment-1763734528
This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.