text-generation-webui
Any chance we could get compatibility w/ the Nvidia Maxwell generation of GPUs? I've got a Tesla M40 24G card and I can't run any of the models...
The Tesla P40 (P as in Peter) does alright with oobabooga, but when I try to use it with --load-in-8bit I get errors about 8-bit matmul not being supported. This is interesting, because the documentation says the card does support those features.
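For context, --load-in-8bit routes matrix multiplies through bitsandbytes, whose int8 matmul kernels have historically required a compute capability of at least 7.5 (Turing or newer), which neither Maxwell nor Pascal has. A minimal sketch of that check, assuming that threshold and using a small hand-written lookup table rather than querying CUDA:

```python
# Sketch: why --load-in-8bit fails on Maxwell/Pascal cards.
# Assumption: early bitsandbytes releases required compute capability >= 7.5
# for int8 matmul. The lookup table below is hand-written, not queried from CUDA.

INT8_MATMUL_MIN_CC = (7, 5)  # assumed threshold (Turing and newer)

GPU_COMPUTE_CAPABILITY = {
    "Tesla M40": (5, 2),  # Maxwell
    "Tesla P40": (6, 1),  # Pascal
    "RTX 2080": (7, 5),   # Turing
}

def supports_int8_matmul(gpu_name: str) -> bool:
    """True if the (hardcoded) compute capability meets the assumed int8 threshold."""
    return GPU_COMPUTE_CAPABILITY[gpu_name] >= INT8_MATMUL_MIN_CC

for name in GPU_COMPUTE_CAPABILITY:
    print(f"{name}: int8 matmul supported = {supports_int8_matmul(name)}")
```

Tuple comparison does the right thing here: (6, 1) < (7, 5), so the P40 fails the check even though its documentation advertises int8 arithmetic (the card has int8 DP4A instructions, but not the tensor-core path the kernels target).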
I've spent the better part of two days battling mismatched libraries and outdated directions with my own M40 24GB, only to stall here; at least this is a definitive answer, and an issue to watch.
To my knowledge, the M40 is a different architecture (Maxwell) and may be missing some features/instructions needed to run properly. Someone seems to have solved it by using an older version of the software, but I have yet to get it running on my own M40. The link is here:
https://github.com/danmincu/text-generation-webui-m40
If anyone gets it working, please let me know. I've spent too much time on this subject and it feels like a hard one to fix.
That repo is based on a version of GPTQ from March (see here), which AFAIK is incompatible with any newer GPTQ quantizations due to breaking changes made in early April (see here). It will fail to load newer quants with a KeyError exception. I haven't even gotten the latest qwopqwop200/cuda branch to work with newer quants for some reason.
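The KeyError mentioned above is the typical symptom of a format mismatch: the loader indexes the checkpoint's state dict by the tensor names it expects, and a quant produced by an incompatible GPTQ version fails as soon as a name doesn't line up. A sketch of that failure mode (the tensor names are illustrative, not the real GPTQ layout):

```python
# Sketch of the KeyError failure mode when loading a quant made by an
# incompatible GPTQ version. Tensor names here are illustrative only.

loader_expects = ["layer0.qweight", "layer0.qzeros", "layer0.scales"]

# A checkpoint from a different GPTQ version is missing one expected tensor:
mismatched_checkpoint = {
    "layer0.qweight": "...",
    "layer0.scales": "...",
}

def load(expected_names, state_dict):
    # A strict lookup raises KeyError on the first missing tensor.
    return {name: state_dict[name] for name in expected_names}

try:
    load(loader_expects, mismatched_checkpoint)
except KeyError as missing:
    print(f"failed to load: missing tensor {missing}")
```

This is why re-quantizing the model with the matching GPTQ version, rather than patching the loader, is usually the practical fix.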
This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.
I managed to get the M40 working with the latest branch at the time of writing, so my fork is now obsolete. From a bird's-eye view, these are the steps:
a) install Ubuntu 22.04.3 LTS Server (not Desktop)
b) remove the existing NVIDIA driver(s) and install them together with CUDA 11.7, e.g.
sudo apt-add-repository -r ppa:graphics-drivers/ppa
sudo apt update
sudo apt remove 'nvidia*'
sudo apt autoremove
# install CUDA 11.7 (the runfile includes the matching 515 driver)
wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda_11.7.0_515.43.04_linux.run
sudo sh cuda_11.7.0_515.43.04_linux.run
c) git clone this repo and run start_linux.sh
and when the installer asks, choose NOT to use the latest CUDA but 11.8 instead <- this was very important!!!
d) run this to get a shell inside the installer's environment >>> ./cmd_linux.sh
e) run this to download a sample model >>> python download-model.py anon8231489123/vicuna-13b-GPTQ-4bit-128g
f) run ./start_linux.sh --listen
in the UI, select the model and set wbits + groupsize to match the sample model (for this one: wbits = 4, groupsize = 128, as in the model name)
You should be able to get it working at this point!
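The CUDA pin in step (c) is the load-bearing detail: prebuilt PyTorch/GPTQ wheels are compiled against one CUDA series (e.g. "cu117", "cu118"), and a toolkit from a different series can fail to load or build extensions. A sketch of that compatibility rule, assuming simple string bookkeeping (both helper names here are hypothetical, not part of any installer):

```python
# Sketch: why pinning CUDA 11.8 in step (c) matters. Wheels are tagged
# by CUDA series, and mixing series is a common source of breakage.
# Both helpers below are hypothetical illustrations.

def cuda_series(version: str) -> str:
    """Map a toolkit version like '11.8.0' to a wheel tag like 'cu118'."""
    major, minor = version.split(".")[:2]
    return f"cu{major}{minor}"

def compatible(toolkit_version: str, wheel_tag: str) -> bool:
    """A wheel built for one CUDA series only matches that series."""
    return cuda_series(toolkit_version) == wheel_tag

print(compatible("11.8.0", "cu118"))  # the combination chosen in step (c)
print(compatible("12.1.0", "cu118"))  # a newer toolkit would not match
```

The same logic explains why the installer prompt in step (c) offers a choice at all: picking "latest" would pull wheels for a CUDA series the older card's driver stack isn't set up for.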