LaaZa
This might be an issue with Windows 7; it's not supported. You could debug your torch version, but it's not really a webui issue if it's caused by Win7. Well, technically pytorch...
I don't think it has been specifically updated in a way that would have made it incompatible. Or are you saying that it used to work for you?
How did you update? Something is up with your pytorch. See https://pytorch.org/get-started/locally/; after `pip install`, add `-U` to force a reinstall.
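For example, a minimal sketch of such a reinstall, assuming the CUDA 11.8 wheel index from the PyTorch site (adjust the index URL for your CUDA version):

```
pip install -U torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```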
They should be in webui.py, line 144:
```python
def run_model():
    os.chdir("text-generation-webui")
    run_cmd("python server.py --auto-devices --chat --listen")
```
Put them in webui.py at line 146 `run_cmd("python server.py --load-in-8bit --chat --listen")`
@oobabooga can we have an env variable or something in the launch script instead of editing a random line in a Python file? I know this is mostly for the...
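Something like this minimal sketch, reusing the existing `run_cmd` helper and a hypothetical `WEBUI_FLAGS` variable:

```python
import os

def run_model():
    # Hypothetical: extra server flags are read from an environment
    # variable instead of being hard-coded in the script.
    extra_flags = os.environ.get("WEBUI_FLAGS", "--chat --listen")
    os.chdir("text-generation-webui")
    run_cmd(f"python server.py {extra_flags}")
```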
My performance with 7B quantized (in textgen) is at most 20 tokens/s on a 3080. (CUDA 11.8, WSL)
> > My performance with 7B quantized (in textgen) is at most 20 tokens/s on a 3080. (CUDA 11.8, WSL)
>
> But quantized is much slower than fp16 (and...
I think bad performance is most common for people running on their own systems too. Also, using textgen I get about 18 tokens/s with AutoGPTQ "old cuda" + fused_attn...
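For reference, a rough sketch of how that combination can be selected when loading with AutoGPTQ directly, rather than through textgen; the model path is hypothetical:

```python
from auto_gptq import AutoGPTQForCausalLM

model = AutoGPTQForCausalLM.from_quantized(
    "models/llama-7b-4bit",        # hypothetical quantized model directory
    device="cuda:0",
    use_triton=False,              # CUDA kernel ("old cuda") rather than Triton
    inject_fused_attention=True,   # fused_attn
)
```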
Ran your test in my WSL: 115 t/s (about 90 on Windows 11 directly). My friend ran it on Windows 10 with a 1070 and got 139 t/s. CUDA 11.8 for both. I...