anything-llm
[BUG]: LLAMA3 not using GPU
How are you running AnythingLLM?
AnythingLLM desktop app
What happened?
Hello, everyone! I installed the desktop app today (Windows) and ran it with llama3 7b. The problem is that the model isn't using my GPU, only my CPU, and I can't find the reason why. I already set the executable to run on the GPU in the Nvidia Control Panel, but it didn't work. My GPU is an Nvidia GeForce RTX 3050.
Are there known steps to reproduce?
No response
The built-in LLM engine for AnythingLLM is actually Ollama. Do you have Ollama on the same machine, and does Ollama outside of AnythingLLM use the GPU?
If not, I suspect you may not have CUDA installed, so the bindings are not being found to load and run models with GPU layers.
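If it helps, a quick way to compare (a sketch, assuming a standalone Ollama install and the standard NVIDIA driver tooling):

```powershell
# Confirm the NVIDIA driver sees the card at all
nvidia-smi

# Load a model, then ask Ollama where it placed it.
# The PROCESSOR column should read "100% GPU" if offloading works.
ollama run llama3 "hello"
ollama ps
```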
I have no other Ollama instances and I have CUDA installed!
This seems like something Ollama needs to work on and not something we can manipulate directly via the built-in engine: https://github.com/ollama/ollama/issues/3201
It may be worth installing Ollama separately and using that as your LLM to fully leverage the GPU, since there seems to be some kind of issue with that card/CUDA combination for native pickup. I'm on an NVIDIA 4090 and it finds my GPU and offloads accordingly.
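If you go that route, the standalone Ollama server listens on 127.0.0.1:11434 by default; a quick sanity check before pointing AnythingLLM's LLM provider at it (a sketch, assuming the default OLLAMA_HOST):

```powershell
# Should return a JSON version string if the server is up
# (curl.exe, not the PowerShell curl alias)
curl.exe http://127.0.0.1:11434/api/version
```

Then select Ollama as the LLM provider in AnythingLLM and point its base URL at http://127.0.0.1:11434.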
I did install it separately, and it seems to be using my GPU now and running faster. It was the right decision anyway, since I will also be integrating Ollama with other software.
Thanks, timothy.
@kalilfagundes No problem, I'm sorry you came across that. The native GPU stuff can be tricky; our built-in engine makes things easy to get going, but it's certainly not foolproof. There is always a way forward while staying local.
Same problem here, on Windows 11. Ollama installed on WSL2 uses my Nvidia GPU, but AnythingLLM desktop app doesn't. Tried llama3 and mistral, same outcome. All software is up-to-date. Nvidia GTX 1070, Ryzen 5600x.
@Nododot - the issue here is almost certainly that you are on WSL. If you use the Ollama Windows app on the host machine outside WSL, it will bind to GPUs; otherwise, raise an issue on Ollama's repo, as that would be out of scope for this repo anyway.
anything-llm will use the GPU for Ollama if you add C:\Users\<user-name>\AppData\Local\Programs\AnythingLLM\resources\ollama to the TOP of the PATH environment variable.
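For anyone wanting to try this, a sketch of the prepend in PowerShell (the GUI Environment Variables dialog works too; the path assumes the default install location):

```powershell
# Prepend the bundled Ollama directory for the current session only (easy to test)
$env:Path = "C:\Users\<user-name>\AppData\Local\Programs\AnythingLLM\resources\ollama;$env:Path"

# Make it persistent for your user account
[Environment]::SetEnvironmentVariable(
    'Path',
    "C:\Users\<user-name>\AppData\Local\Programs\AnythingLLM\resources\ollama;" +
    [Environment]::GetEnvironmentVariable('Path', 'User'),
    'User'
)
```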
Is this working as expected? High CPU, high RAM, and high VRAM usage, but low GPU utilization. Is it running on the GPU? I'm using this model, and I've added C:\Users\<user-name>\AppData\Local\Programs\AnythingLLM\resources\ollama to PATH.
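For what it's worth, high VRAM with low GPU utilization can still mean the model is on the GPU: the weights stay resident in VRAM the whole time, while compute utilization only spikes during token generation. One way to tell (a sketch, assuming the standard NVIDIA tooling and a reachable Ollama CLI):

```powershell
# Refresh GPU stats every second; watch GPU-Util while a response is generating
nvidia-smi -l 1

# Or ask Ollama directly where the loaded model lives ("100% GPU" vs "100% CPU")
ollama ps
```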