GPT4All not using my GPU: models not unloading from VRAM when switching
Issue you'd like to raise.
An RTX 3060 12 GB is available as a selection, but queries run on the CPU and are very slow.
Cuda compilation tools, release 12.2, V12.2.128 Build cuda_12.2.r12.2/compiler.33053471_0
Suggestion:
No response
It was a VRAM issue. HOWEVER, it happens because changing models in the GUI does not always unload the previous model from GPU RAM. Switch between models a few times and, boom, you're up to 12 GB. It may be specific to switching to and from the models I got from TheBloke on Hugging Face.
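For anyone who wants to reproduce this, here is a minimal sketch using the gpt4all Python bindings (the filenames are placeholders for two Q4_0 GGUF models you actually have; watch VRAM with `nvidia-smi` in another terminal):

```python
from gpt4all import GPT4All

# Alternate between two models several times. If the backend frees VRAM
# when a model is dropped, usage should stay flat; if it doesn't, it
# climbs with each switch. Filenames below are placeholders.
for name in ["model-a.Q4_0.gguf", "model-b.Q4_0.gguf"] * 3:
    model = GPT4All(name, device="gpu")
    model.generate("ping", max_tokens=8)
    del model  # release the model; VRAM should drop back down here
```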
No, it's because custom models don't seem to use the GPU. I thought I was going insane until I tried downloading GPT4All Falcon and it used my GPU. Switching back to my Hugging Face models, they all use the CPU.
The Vulkan backend only supports Q4_0 and Q4_1 quantizations currently, and Q4_1 is not recommended for LLaMA-2 based models.
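So a quick sanity check is to load a model known to be Q4_0 with the GPU requested explicitly. A rough sketch with the Python bindings (the filename is an example; substitute the file you downloaded):

```python
from gpt4all import GPT4All

# GPT4All Falcon ships as Q4_0, which the Vulkan backend supports,
# so this should run on the GPU. The filename is an example.
model = GPT4All("gpt4all-falcon-q4_0.gguf", device="gpu")
print(model.generate("Say hello.", max_tokens=16))
```

A Q5/Q6/Q8 GGUF from Hugging Face would not be offloaded by the Vulkan backend, which would explain the custom models running on the CPU.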