
GPT4All not using my GPU because models are not unloading from VRAM when switching

Open nimzodisaster opened this issue 1 year ago • 3 comments

Issue you'd like to raise.

RTX 3060 12 GB is available as a selection, but queries run on the CPU and are very slow.

Cuda compilation tools, release 12.2, V12.2.128 Build cuda_12.2.r12.2/compiler.33053471_0
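
In case it helps reproduce: the same backend can be driven from the Python bindings, which makes it easier to check whether the model actually lands on the GPU. A minimal sketch, assuming a recent gpt4all package; the model filename is just an example, and `model.device` reporting the active device is an assumption about the installed version:

```python
from gpt4all import GPT4All

# Ask for the Vulkan GPU backend explicitly. Depending on the version,
# an unsupported model either falls back to CPU or raises an error.
model = GPT4All("mistral-7b-instruct-v0.1.Q4_0.gguf", device="gpu")

# Assumption: recent bindings expose the active device as a property;
# it should name the GPU when offload actually worked.
print("running on:", model.device)

print(model.generate("Why is the sky blue?", max_tokens=64))
```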

Suggestion:

No response

nimzodisaster avatar Nov 28 '23 23:11 nimzodisaster

It was a VRAM issue. HOWEVER, it happens because changing models in the GUI does not always unload the previous model from GPU RAM. Switch between models a few times and, boom, it's up to 12 GB. It may be specific to switching to and from the models I got from TheBloke on Hugging Face.
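
Outside the GUI, the workaround seems to be dropping the old handle explicitly before loading the next model. A rough sketch with the Python bindings; whether `del` actually returns the VRAM promptly depends on the bindings' destructor, so treat that as an assumption (the model filenames are examples):

```python
import gc
from gpt4all import GPT4All

model = GPT4All("llama-2-7b-chat.Q4_0.gguf", device="gpu")
print(model.generate("Hello", max_tokens=16))

# Drop the only reference to the old model before loading the next one,
# so both never sit in VRAM at the same time. That the destructor frees
# the GPU allocation right away is an assumption about the bindings.
del model
gc.collect()

model = GPT4All("mistral-7b-instruct-v0.1.Q4_0.gguf", device="gpu")
print(model.generate("Hello again", max_tokens=16))
```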

nimzodisaster avatar Nov 29 '23 00:11 nimzodisaster

No, it's because custom models don't seem to use the GPU. I thought I was going insane until I tried downloading GPT4All Falcon, and it used my GPU. Switching back to my Hugging Face models, they all use the CPU.

nanafy avatar Nov 30 '23 06:11 nanafy

> Switching back to my Hugging Face models, they all use the CPU.

The Vulkan backend only supports Q4_0 and Q4_1 quantizations currently, and Q4_1 is not recommended for LLaMA-2 based models.
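
A quick way to pre-check a file from TheBloke is the quantization tag in the filename, since his GGUF uploads encode it (e.g. `...Q4_K_M.gguf`). This is just a filename heuristic, not a real read of the GGUF metadata:

```python
import re
from pathlib import Path

# All the Vulkan backend currently accelerates.
VULKAN_QUANTS = {"Q4_0", "Q4_1"}

def vulkan_compatible(model_path: str) -> bool:
    """Heuristic: pull the quantization tag out of the filename.

    Files named like 'llama-2-7b-chat.Q4_K_M.gguf' use anything other
    than Q4_0/Q4_1 and will silently run on the CPU instead.
    """
    match = re.search(r"\.(Q\d+_[A-Z0-9_]+)\.gguf$",
                      Path(model_path).name, re.IGNORECASE)
    return bool(match) and match.group(1).upper() in VULKAN_QUANTS

print(vulkan_compatible("llama-2-7b-chat.Q4_0.gguf"))    # True  -> can offload
print(vulkan_compatible("llama-2-7b-chat.Q4_K_M.gguf"))  # False -> CPU fallback
```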

cebtenzzre avatar Dec 05 '23 18:12 cebtenzzre