warn("The installed version of bitsandbytes was compiled without GPU support. " - On Linux even tough I have a GPU

[Open] MurilloDaniel opened this issue 1 year ago • 14 comments

Describe the bug

I installed it on my Linux Mint, but when I ran:

    conda activate textgen
    cd text-generation-webui
    python server.py

it gives me this:

    UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
      warn("The installed version of bitsandbytes was compiled without GPU support. "
    Running on local URL:

I can click on the local URL and it opens in my browser, but when I select the pygmalion model it gives me this error:

    Traceback (most recent call last):
      File "/home/murillo/text-generation-webui/server.py", line 84, in load_model_wrapper
        shared.model, shared.tokenizer = load_model(shared.model_name)
      File "/home/murillo/text-generation-webui/modules/models.py", line 101, in load_model
        from modules.GPTQ_loader import load_quantized
      File "/home/murillo/text-generation-webui/modules/GPTQ_loader.py", line 14, in <module>
        import llama_inference_offload
    ModuleNotFoundError: No module named 'llama_inference_offload'

Is there an existing issue for this?

  • [X] I have searched the existing issues

Reproduction

run:

    conda activate textgen
    cd text-generation-webui
    python server.py

and select the pygmalion model

Screenshot

No response

Logs

    Traceback (most recent call last):
      File "/home/murillo/text-generation-webui/server.py", line 84, in load_model_wrapper
        shared.model, shared.tokenizer = load_model(shared.model_name)
      File "/home/murillo/text-generation-webui/modules/models.py", line 101, in load_model
        from modules.GPTQ_loader import load_quantized
      File "/home/murillo/text-generation-webui/modules/GPTQ_loader.py", line 14, in <module>
        import llama_inference_offload
    ModuleNotFoundError: No module named 'llama_inference_offload'

System Info

Distro: Linux Mint 21.1 Vera
    base: Ubuntu 22.04 jammy
CPU:
  Info: quad core model: Intel Core i5-7400 bits: 64 type: MCP
Graphics:
  Device-1: NVIDIA GP106 [GeForce GTX 1060 6GB]
driver: nvidia v: 525.105.17

MurilloDaniel commented Apr 14 '23 23:04
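
A note on the traceback above: llama_inference_offload lives in the GPTQ-for-LLaMa repository, which the web UI of that era expected to find under text-generation-webui/repositories/. A minimal sketch of the usual fix, where the fork URL and the presence of a requirements.txt are assumptions; use whichever GPTQ-for-LLaMa fork matches your checkout:

    # clone GPTQ-for-LLaMa where modules/GPTQ_loader.py looks for it
    cd text-generation-webui
    mkdir -p repositories
    cd repositories
    git clone https://github.com/oobabooga/GPTQ-for-LLaMa.git   # fork URL assumed
    cd GPTQ-for-LLaMa
    pip install -r requirements.txt   # if the fork ships one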

Related to: https://github.com/oobabooga/text-generation-webui/issues/1164

bbecausereasonss commented Apr 14 '23 23:04

The only thing that worked was:

    pip install -i https://test.pypi.org/simple/ bitsandbytes-cuda113

bbecausereasonss commented Apr 14 '23 23:04
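
If the CPU-only wheel is still installed, it can shadow the replacement; a sketch of the full sequence, with a quick check of which copy Python actually imports:

    pip uninstall -y bitsandbytes
    pip install -i https://test.pypi.org/simple/ bitsandbytes-cuda113
    # confirm the import resolves to the new package
    python -c "import bitsandbytes as bnb; print(bnb.__file__)"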

Related to: #1164

I already took a look at it, and it didn't fix it.

MurilloDaniel commented Apr 14 '23 23:04

The only thing that worked was:

    pip install -i https://test.pypi.org/simple/ bitsandbytes-cuda113

Got this error now:

    conda activate textgen
    cd text-generation-webui
    python server.py

    Device with CUDA capability of 6 not supported for 8-bit matmul. Device has no tensor cores!
    Traceback (most recent call last):
      File "/home/murillo/text-generation-webui/server.py", line 27, in <module>
        from modules import api, chat, shared, training, ui
      File "/home/murillo/text-generation-webui/modules/training.py", line 12, in <module>
        from peft import (LoraConfig, get_peft_model, get_peft_model_state_dict,
      File "/home/murillo/miniconda3/envs/textgen/lib/python3.10/site-packages/peft/__init__.py", line 22, in <module>
        from .mapping import MODEL_TYPE_TO_PEFT_MODEL_MAPPING, PEFT_TYPE_TO_CONFIG_MAPPING, get_peft_config, get_peft_model
      File "/home/murillo/miniconda3/envs/textgen/lib/python3.10/site-packages/peft/mapping.py", line 16, in <module>
        from .peft_model import (
      File "/home/murillo/miniconda3/envs/textgen/lib/python3.10/site-packages/peft/peft_model.py", line 31, in <module>
        from .tuners import AdaLoraModel, LoraModel, PrefixEncoder, PromptEmbedding, PromptEncoder
      File "/home/murillo/miniconda3/envs/textgen/lib/python3.10/site-packages/peft/tuners/__init__.py", line 20, in <module>
        from .lora import LoraConfig, LoraModel
      File "/home/murillo/miniconda3/envs/textgen/lib/python3.10/site-packages/peft/tuners/lora.py", line 39, in <module>
        import bitsandbytes as bnb
      File "/home/murillo/.local/lib/python3.10/site-packages/bitsandbytes/__init__.py", line 7, in <module>
        from .autograd._functions import mm_cublas, bmm_cublas, matmul_cublas, matmul, MatmulLtState
      File "/home/murillo/.local/lib/python3.10/site-packages/bitsandbytes/autograd/_functions.py", line 129, in <module>
        class MatmulLtState:
      File "/home/murillo/.local/lib/python3.10/site-packages/bitsandbytes/autograd/_functions.py", line 148, in MatmulLtState
        formatB = F.get_special_format_str()
      File "/home/murillo/.local/lib/python3.10/site-packages/bitsandbytes/functional.py", line 1163, in get_special_format_str
        assert major >= 7
    AssertionError
    (textgen) murillo@Murillo-System:~/text-generation-webui$

MurilloDaniel commented Apr 14 '23 23:04
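
The assert major >= 7 at the bottom of that traceback checks the GPU's CUDA compute capability: bitsandbytes' 8-bit matmul needs tensor cores, which arrived with capability 7.0 (Volta), while a GTX 1060 (Pascal) reports 6.1. A quick way to check your own card, assuming a CUDA-enabled PyTorch install:

    # prints a (major, minor) tuple; (7, 0) or higher is needed for 8-bit matmul
    python -c "import torch; print(torch.cuda.get_device_capability(0))"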

What's your video card?

bbecausereasonss commented Apr 15 '23 00:04

What's your video card?

nvidia 1060 6gb

MurilloDaniel commented Apr 15 '23 00:04

Pretty sure that's not enough to run most models. I could be wrong.


bbecausereasonss commented Apr 15 '23 00:04

I also have this problem, but I installed the ROCm version.

Solved: remove the installed bitsandbytes (if any), then build and install the ROCm fork.

CristianPi commented Apr 15 '23 02:04
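
A sketch of that remove-then-rebuild sequence for ROCm; the fork URL and make target below are assumptions, so check the README of whichever ROCm fork you use:

    pip uninstall -y bitsandbytes
    git clone https://github.com/broncotc/bitsandbytes-rocm.git   # fork URL assumed
    cd bitsandbytes-rocm
    make hip   # build target name assumed; see the fork's README
    python setup.py install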

    pip install -i https://test.pypi.org/simple/ bitsandbytes-cuda113

Fixed it for me on Linux, thank you.

kernelzeroday commented Apr 20 '23 16:04

The only thing that worked was:

    pip install -i https://test.pypi.org/simple/ bitsandbytes-cuda113

I did this on Windows without seeing it was for Linux. Getting an error now; working on resolving it.

OO-ooo-OO commented Apr 22 '23 23:04

Hi,

I had this problem with an installation on Linux at commit 58c8ab4c7aaccc50f507fd08cce941976affe5e0 (git checkout 58c8ab4c7aaccc50f507fd08cce941976affe5e0), which works with the model vicuna-13B-1.1-GPTQ-4bit-128g.

The solution was twofold:

  • first, I recompiled the bitsandbytes lib for my exact CUDA version (12.1) and installed it.
  • second, I installed the cuda-toolkit package from conda (conda install -c nvidia cuda-toolkit)

To recompile bitsandbytes :

    cd repositories
    git clone https://github.com/timdettmers/bitsandbytes.git
    cd bitsandbytes
    CUDA_VERSION=121 make cuda12x
    python setup.py install

Obviously you will need to use your own CUDA version (check with nvcc --version), keeping only the first three digits. If your version starts with 11, use make cuda11x.

Best regards,

pi-infected commented Apr 25 '23 17:04
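
To read off the three digits pi-infected mentions, assuming nvcc is on your PATH:

    # e.g. "release 12.1" -> CUDA_VERSION=121 and target cuda12x
    nvcc --version | grep release
    CUDA_VERSION=121 make cuda12x
    # for an 11.x toolkit such as 11.8 it would be:
    # CUDA_VERSION=118 make cuda11x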

Do you need to remove the prior bitsandbytes version before recompiling? Is there a command that does that, or do I have to go in and delete all the relevant directories? I am using Ubuntu on WSL.

aleatorydialogue commented Apr 27 '23 14:04

No, you don't need to remove it manually; just install on top.

pi-infected commented Apr 28 '23 10:04
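
Installing on top usually replaces the files in place, but if a copy in ~/.local competes with one inside the conda env, it may help to check which one actually gets imported; a quick sanity check:

    # where the imported package lives
    python -c "import bitsandbytes, os; print(os.path.dirname(bitsandbytes.__file__))"
    # version and location as pip sees it
    pip show bitsandbytes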

Thanks, I'm trying this right now; it's compiling something, alright. I don't want to use my CPU on my Linux PC for these tasks. I've only been using llama.cpp on the CPU on my phone; it works great, but it's slow and crashes sometimes.

ProfessorSparrs commented May 08 '23 21:05

All of a sudden I'm getting this error again :/ and the old fixes aren't working.

    (textgen) perplexity@Perplexity:~/text-generation-webui$ python server.py --chat --listen --disk --gpu-memory 22 --groupsize 128 --model_type llama
    INFO:Gradio HTTP request redirected to localhost :)
    bin /home/perplexity/miniconda3/envs/textgen/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so
    /home/perplexity/miniconda3/envs/textgen/lib/python3.10/site-packages/bitsandbytes/cextension.py:33: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
      warn("The installed version of bitsandbytes was compiled without GPU support. "
    INFO:Loading the extension "gallery"...
    Running on local URL: http://0.0.0.0:8989

bbecausereasonss commented May 24 '23 14:05
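
The bin .../libbitsandbytes_cpu.so line in that log shows which shared library bitsandbytes selected at import time; listing the package directory shows whether a CUDA build is present at all. A sketch:

    # if only libbitsandbytes_cpu.so is listed, the installed wheel has no GPU support
    ls "$(python -c 'import bitsandbytes, os; print(os.path.dirname(bitsandbytes.__file__))')"/libbitsandbytes*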

Cheers @pi-infected, this drove me nuts; worked like a charm :)

fkSoc1ety commented Jul 08 '23 10:07

Same problem on M1 Mac, anyone have a fix?

sij-ai commented Jul 30 '23 00:07

I ended up using exllama as the ML model execution engine, as I was unable to make it work with the latest updates of ooba. Exllama works right off the bat on my computer. I use it without a GUI, so it can be a solution if you don't need one and just call it from your programs.

pi-infected commented Jul 31 '23 08:07
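
For anyone wanting to try the same route, a rough sketch of running exllama standalone; the example script name and flag here are assumptions, so check the repository's README:

    git clone https://github.com/turboderp/exllama.git
    cd exllama
    pip install -r requirements.txt
    # script name and -d flag assumed; point it at a GPTQ model directory
    python example_chatbot.py -d /path/to/your-gptq-model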

I'm getting this error now:

All of a sudden I'm getting this error again :/ and the old fixes aren't working.

    (textgen) perplexity@Perplexity:~/text-generation-webui$ python server.py --chat --listen --disk --gpu-memory 22 --groupsize 128 --model_type llama
    INFO:Gradio HTTP request redirected to localhost :)
    bin /home/perplexity/miniconda3/envs/textgen/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so
    /home/perplexity/miniconda3/envs/textgen/lib/python3.10/site-packages/bitsandbytes/cextension.py:33: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
      warn("The installed version of bitsandbytes was compiled without GPU support. "
    INFO:Loading the extension "gallery"...
    Running on local URL: http://0.0.0.0:8989

nephi-dev commented Sep 16 '23 03:09

https://github.com/TimDettmers/bitsandbytes/issues/112#issuecomment-1763734528

vgudavarthi commented Oct 16 '23 05:10

This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.

github-actions[bot] commented Nov 27 '23 23:11