
ExLLamaV2 loader doesn't work on WSL

Open · Barnplaid opened this issue 1 year ago

Describe the bug

Today I updated to the latest commit (771c59290a72464a1666a1fd5971ccd9c96e4e11), ran pip install -r requirements.txt --upgrade, and am now getting the error message below. I had previously hit a similar error and solved it with the steps here: https://github.com/oobabooga/text-generation-webui/issues/5408#issuecomment-1936180692

This time, though, those steps didn't work.

Is there an existing issue for this?

  • [X] I have searched the existing issues

Reproduction

I ran python server.py --auto-devices --listen --listen-port 7860 --api --model LoneStriker_miqu-1-70b-sf-2.65bpw-h6-exl2

Screenshot

No response

Logs

Traceback (most recent call last):
  File "/home/myuser/ooba/server.py", line 241, in <module>
    shared.model, shared.tokenizer = load_model(model_name)
  File "/home/myuser/ooba/modules/models.py", line 87, in load_model
    output = load_func_map[loader](model_name)
  File "/home/myuser/ooba/modules/models.py", line 378, in ExLlamav2_HF_loader
    from modules.exllamav2_hf import Exllamav2HF
  File "/home/myuser/ooba/modules/exllamav2_hf.py", line 7, in <module>
    from exllamav2 import (
  File "/home/myuser/miniconda3/envs/textgen3/lib/python3.11/site-packages/exllamav2/__init__.py", line 3, in <module>
    from exllamav2.model import ExLlamaV2
  File "/home/myuser/miniconda3/envs/textgen3/lib/python3.11/site-packages/exllamav2/model.py", line 16, in <module>
    from exllamav2.config import ExLlamaV2Config
  File "/home/myuser/miniconda3/envs/textgen3/lib/python3.11/site-packages/exllamav2/config.py", line 2, in <module>
    from exllamav2.fasttensors import STFile
  File "/home/myuser/miniconda3/envs/textgen3/lib/python3.11/site-packages/exllamav2/fasttensors.py", line 5, in <module>
    from exllamav2.ext import exllamav2_ext as ext_c
  File "/home/myuser/miniconda3/envs/textgen3/lib/python3.11/site-packages/exllamav2/ext.py", line 15, in <module>
    import exllamav2_ext
ImportError: /home/myuser/miniconda3/envs/textgen3/lib/python3.11/site-packages/exllamav2_ext.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN3c107WarningC1ENS_7variantIJNS0_11UserWarningENS0_18DeprecationWarningEEEERKNS_14SourceLocationESsb
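
For anyone reading the log: the undefined symbol demangles to a c10::Warning constructor, which lives in libtorch, so the prebuilt exllamav2_ext binary was most likely compiled against a different torch build than the one installed in the environment. A minimal sketch for comparing the installed versions against the wheel tag (standard library plus torch; nothing webui-specific):

import importlib.metadata as md  # stdlib; reads installed package metadata
import torch

# The exllamav2 wheel's tag (e.g. "+cu121", "cp311") has to match both the
# Python version and the torch/CUDA build it was compiled against.
print("torch:    ", torch.__version__)         # e.g. 2.2.0+cu121
print("CUDA:     ", torch.version.cuda)        # e.g. 12.1
print("exllamav2:", md.version("exllamav2"))   # compare with the wheel filename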

System Info

Windows 11
WSL (Ubuntu)
Nvidia 4090
64 GB RAM

Barnplaid avatar Feb 15 '24 19:02 Barnplaid

In case someone wants to solve it manually, I did it by installing exllamav2 and flash-attn:

pip install https://github.com/turboderp/exllamav2/releases/download/0.0.13.post2/exllamav2-0.0.13.post2+cu121-cp311-cp311-linux_x86_64.whl

pip install flash_attn transformers -U
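
A quick smoke test after installing, in case it helps (a minimal sketch; it just exercises the same import chain that fails in the traceback above):

# If the new wheel matches the installed torch, both imports succeed; the
# broken install raises ImportError ("undefined symbol") on the first line,
# since exllamav2_ext is the compiled CUDA extension itself.
import exllamav2_ext
from exllamav2 import ExLlamaV2, ExLlamaV2Config
print("exllamav2 loaded OK")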

Barnplaid avatar Feb 15 '24 19:02 Barnplaid

Thanks, @Barnplaid. That did the trick for me too.

sophosympatheia avatar Feb 19 '24 01:02 sophosympatheia

Hit this in the nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04 Docker container as well; from the look of things, I'd guess the ExLlamaV2 wheel pinned in requirements.txt was built against a newer glibc than the one that ships in that Ubuntu release, so the extension won't load.
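
In case anyone wants to check the glibc guess in their own container, here's a small stdlib-only sketch:

import platform

# Runtime glibc of the container. The wheel's .so records the GLIBC_x.y
# versions it needs (inspect with: objdump -T exllamav2_ext*.so | grep GLIBC);
# if any required version is newer than this, the extension fails to load.
print(platform.libc_ver())  # e.g. ('glibc', '2.35') on Ubuntu 22.04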

neggles avatar Feb 26 '24 08:02 neggles

This issue has been closed due to inactivity for 2 months. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.

github-actions[bot] avatar Apr 26 '24 23:04 github-actions[bot]