text-generation-webui
Any idea why Linux is 5x faster than Windows?
I have a 3090 and 32 GB of RAM, using Windows 11 and Ubuntu Virtual machine (WSL).
With the Windows 11 install I get maybe 3 words per second at best (7B), while on Linux the text just pumps out (more than 15 words per second). I am running both without 8-bit or 4-bit quantization. Why is the speed so different?
Ubuntu Virtual machine
Lots of performance is lost for this reason.
But that would make the Linux version slower not faster.
But even running it in a VM is so much faster that I can't even compare them.
No, just try installing LLaMA 7B on an Ubuntu VM and see. It's about 5 times faster; you can get a full page of text within 30 seconds.
I have now tried fully resetting the Windows installation and using my main SSD, and it is still significantly slower than Linux.
It's a WSL issue.
They said the windows 11 version performs well on "some" tasks. But they use a lot of weasel words to try to sell it.
I/O tasks are also slower and this all uses a LOT of I/O.
As long as you are really getting the cuda versions of pytorch and all of that I don't think there is a "fix".
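A quick way to check that point is to ask PyTorch itself whether it was built with CUDA and can see the GPU. This is only a minimal sketch, assuming PyTorch is the backend in use; the helper name `cuda_report` is made up for illustration and is not part of text-generation-webui.

```python
def cuda_report():
    """Summarize whether the installed PyTorch build can use CUDA."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment at all.
        return {"torch_installed": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None here means this PyTorch build was not compiled against CUDA.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        # Name of the first visible GPU, e.g. the card the model will load on.
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

Running this in both the Windows and the WSL environment shows whether both are actually using the GPU. If `cuda_available` is `False` in one of them, that install is running on the CPU and the speed gap is expected.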
The WSL virtual machine performs much better than native Windows 11 on the same hardware and settings, not the other way around.
There is no "native" windows 11 unless you use directML or something.
When I search there are a ton of unanswered questions from people complaining about cuda pytorch having lower performance but no answers.
By native I mean that my main installation is Windows 11; the Linux tests I did were using WSL. My thought was that the VM runs would be slower, but they were much faster than just running it on Windows. Maybe it's down to system utilization, or it needs to wait for new updates.
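Since "native" and "WSL" keep getting mixed up in this thread, it can help to print which environment a given run is actually in before comparing speeds. A rough sketch: the function name `describe_environment` is hypothetical, and the check relies on the assumption that WSL kernels report a release string containing "microsoft", which is a common heuristic rather than an official API.

```python
import platform


def describe_environment(system=None, release=None):
    """Return a short label for the environment Python is running in.

    Both arguments default to the values reported by the platform module;
    they are parameters only so the logic can be checked with sample values.
    """
    system = platform.system() if system is None else system
    release = platform.release() if release is None else release

    if system == "Windows":
        return "native Windows"
    # Assumption: WSL kernel release strings contain "microsoft".
    if system == "Linux" and "microsoft" in release.lower():
        return "Linux under WSL"
    if system == "Linux":
        return "Linux (not WSL)"
    return system


if __name__ == "__main__":
    print(describe_environment())
```

Pasting that label next to any timing you report makes it clear which of the two setups the number came from.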
So, on that note...
(textgen) ➜ text-generation-webui git:(main) ✗ python server.py --model llama-13b-hf --load-in-8bit --listen-port 7862 --no-stream
(snip)
Output generated in 47.93 seconds (0.52 it/s, 200 tokens)
Output generated in 45.92 seconds (0.54 it/s, 200 tokens)
Is this slow? Or extremely slow? I don't know what to expect, really.
It's ok speed. Not great, not terrible.
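For comparing numbers like these across machines, it is easier to convert each log line to tokens per second. A small sketch that parses the two lines quoted above; the regular expression is an assumption based only on those two lines, and the logged "it/s" value is ignored because the thread does not say what it counts.

```python
import re

# The two log lines quoted in this thread.
LOG_LINES = [
    "Output generated in 47.93 seconds (0.52 it/s, 200 tokens)",
    "Output generated in 45.92 seconds (0.54 it/s, 200 tokens)",
]

# Captures the elapsed seconds and the token count from one log line.
PATTERN = re.compile(r"Output generated in ([\d.]+) seconds \(.*?(\d+) tokens\)")


def tokens_per_second(line):
    """Compute tokens divided by elapsed seconds for one log line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"unrecognized log line: {line!r}")
    seconds = float(match.group(1))
    tokens = int(match.group(2))
    return tokens / seconds


if __name__ == "__main__":
    for line in LOG_LINES:
        print(f"{tokens_per_second(line):.2f} tokens/s")
```

For the two runs above this works out to roughly 4.2 and 4.4 tokens per second for the 13B model loaded in 8-bit.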
I've always found WSL Linux applications perform far better than Windows native. Not sure why...
The difference from running it on Windows vs Linux is night and day for me. Linux does it a lot faster!
How hard is it to set up WSL? Should I be telling people to use WSL?
Just enable 3 Windows features, restart, and download Ubuntu from the Microsoft Store. Within seconds you have a functional Ubuntu install that you can run the web UI in. The speed difference is significant!
I feel like almost all developers who use Windows now use WSL. It has become pretty essential, as it's the easiest, simplest, and most supported way to run Linux CLI applications on Windows. Installing and using it isn't a challenge at all for developers, but it has its quirks and corner cases, like requiring VT-x to be enabled, PyTorch not supporting ROCm, and (I believe) reliance on DirectML to make CUDA work on the backend. Personally, if I were running this on Windows I'd just use the CPU install instructions in WSL, but I am a developer, so I'm not sure whether that would be good advice for non-developers.
I've always found WSL Linux applications perform far better than Windows native. Not sure why...
Probably for the same reasons games run better on Windows. Server and compute applications tend to treat Windows as a second-class citizen, and as a rule, the closer you get to the primary supported environment, the better the experience. It's also plausible that MS is under-investing or misinvesting in parts of its stack.
How hard is it to set up WSL? Should I be telling people to use WSL?
As a non-developer, I found it far easier to just install via WSL when setting up LLaMA 13B 4-bit. I spent hours wrestling with various errors and dependency compatibility issues, only to get it all fully installed and running within minutes via WSL Ubuntu.
I have added a mention of WSL to the README.
This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.
@iChristGit I'm facing the opposite problem:
https://github.com/oobabooga/text-generation-webui/issues/2607