text-generation-webui
ExLlamaV2 loader doesn't work on WSL
Describe the bug
Today I updated to the latest commit (771c59290a72464a1666a1fd5971ccd9c96e4e11), ran pip install -r requirements.txt --upgrade, and am now getting the error below. I had previously hit a similar error and fixed it with the steps in https://github.com/oobabooga/text-generation-webui/issues/5408#issuecomment-1936180692, but that fix doesn't work this time.
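Roughly, the update steps were the following (the git pull step is my paraphrase of "updated to the latest commit"; only the pip command above is quoted verbatim, and the repo path is taken from the traceback below):
cd /home/myuser/ooba
git pull
pip install -r requirements.txt --upgrade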
Is there an existing issue for this?
- [X] I have searched the existing issues
Reproduction
I run
python server.py --auto-devices --listen --listen-port 7860 --api --model LoneStriker_miqu-1-70b-sf-2.65bpw-h6-exl2
Screenshot
No response
Logs
Traceback (most recent call last):

  /home/myuser/ooba/server.py:241 in <module>
      240   # Load the model
    ❱ 241   shared.model, shared.tokenizer = load_model(model_name)
      242   if shared.args.lora:

  /home/myuser/ooba/modules/models.py:87 in load_model
       86   shared.args.loader = loader
    ❱  87   output = load_func_map[loader](model_name)
       88   if type(output) is tuple:

  /home/myuser/ooba/modules/models.py:378 in ExLlamav2_HF_loader
      377   def ExLlamav2_HF_loader(model_name):
    ❱ 378   from modules.exllamav2_hf import Exllamav2HF
      379

  /home/myuser/ooba/modules/exllamav2_hf.py:7 in <module>
        6   import torch
    ❱   7   from exllamav2 import (
        8       ExLlamaV2,

  /home/myuser/miniconda3/envs/textgen3/lib/python3.11/site-packages/exllamav2/__init__.py:3 in <module>
        2
    ❱   3   from exllamav2.model import ExLlamaV2
        4   from exllamav2.cache import ExLlamaV2CacheBase

  /home/myuser/miniconda3/envs/textgen3/lib/python3.11/site-packages/exllamav2/model.py:16 in <module>
       15   import math
    ❱  16   from exllamav2.config import ExLlamaV2Config
       17   from exllamav2.cache import ExLlamaV2CacheBase

  /home/myuser/miniconda3/envs/textgen3/lib/python3.11/site-packages/exllamav2/config.py:2 in <module>
        1   import torch
    ❱   2   from exllamav2.fasttensors import STFile
        3   import os, glob, json

  /home/myuser/miniconda3/envs/textgen3/lib/python3.11/site-packages/exllamav2/fasttensors.py:5 in <module>
        4   import json
    ❱   5   from exllamav2.ext import exllamav2_ext as ext_c
        6   import os

  /home/myuser/miniconda3/envs/textgen3/lib/python3.11/site-packages/exllamav2/ext.py:15 in <module>
       14   try:
    ❱  15   import exllamav2_ext
       16   except ModuleNotFoundError:

ImportError:
/home/myuser/miniconda3/envs/textgen3/lib/python3.11/site-packages/exllamav2_ext.cpython-311-x86_64-linux-gnu.so:
undefined symbol: _ZN3c107WarningC1ENS_7variantIJNS0_11UserWarningENS0_18DeprecationWarningEEEERKNS_14SourceLocationESsb
System Info
Windows 11
WSL (Ubuntu)
Nvidia 4090
64 GB RAM
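The undefined symbol comes from the prebuilt exllamav2_ext extension, so one thing worth checking (a diagnostic sketch on my part, not something from the original report) is which torch and exllamav2 versions the environment actually has installed:
python -c "import torch; print(torch.__version__, torch.version.cuda)"
pip show torch exllamav2 flash-attn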
In case someone wants to solve it manually, I did it by installing exllamav2 and flash-attn:
pip install https://github.com/turboderp/exllamav2/releases/download/0.0.13.post2/exllamav2-0.0.13.post2+cu121-cp311-cp311-linux_x86_64.whl
pip install flash_attn transformers -U
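To confirm the reinstall worked, a quick import check like the following (a hypothetical verification step, not part of the original comment) should now run without the undefined-symbol ImportError:
python -c "from exllamav2 import ExLlamaV2; print('exllamav2 OK')"
python -c "import flash_attn; print('flash_attn', flash_attn.__version__)"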
Thanks, @Barnplaid. That did the trick for me too.
Hit this in the nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04 Docker container as well. From the look of things, I'd guess the ExLlamaV2 wheel pinned in requirements.txt was built against a newer glibc than the one shipped in the Ubuntu base image, so the extension won't load.
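One way to test that guess (a sketch; the .so path is the one from the traceback above and will differ per environment) is to compare the container's glibc version with the highest GLIBC version the extension requires:
ldd --version | head -n 1
objdump -T /home/myuser/miniconda3/envs/textgen3/lib/python3.11/site-packages/exllamav2_ext.cpython-311-x86_64-linux-gnu.so | grep -o 'GLIBC_[0-9.]*' | sort -Vu | tail -n 1
If the second number is higher than the first, the glibc explanation holds; otherwise the mismatch is more likely against the installed torch build.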
This issue has been closed due to inactivity for 2 months. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.