
GPT4All not using my GPU because models are not unloading from VRAM when switching

Open nimzodisaster opened this issue 1 year ago • 3 comments

Issue you'd like to raise.

RTX 3060 12 GB is available as a selection, but queries run on the CPU and are very slow.

Cuda compilation tools, release 12.2, V12.2.128 Build cuda_12.2.r12.2/compiler.33053471_0
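
In case it helps reproduce: the same backend can be driven from the Python bindings, which makes it easier to check whether the model actually lands on the GPU. A minimal sketch, assuming a recent gpt4all package; the model filename is just an example, and `model.device` reporting the active device is an assumption about the installed version:

```python
from gpt4all import GPT4All

# Ask for the Vulkan GPU backend explicitly. Depending on the version,
# an unsupported model either falls back to CPU or raises an error.
model = GPT4All("mistral-7b-instruct-v0.1.Q4_0.gguf", device="gpu")

# Assumption: recent bindings expose the active device as a property;
# it should name the GPU when offload actually worked.
print("running on:", model.device)

print(model.generate("Why is the sky blue?", max_tokens=64))
```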

Suggestion:

No response

nimzodisaster avatar Nov 28 '23 23:11 nimzodisaster

It was a VRAM issue. HOWEVER, it happens because changing models in the GUI does not always unload the previous model from GPU RAM. Switch between models a few times and, boom, it's up to 12 GB. It may be specific to switching to and from the models I got from TheBloke on Hugging Face.
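
Outside the GUI, the workaround seems to be dropping the old handle explicitly before loading the next model. A rough sketch with the Python bindings; whether `del` actually returns the VRAM promptly depends on the bindings' destructor, so treat that as an assumption (the model filenames are examples):

```python
import gc
from gpt4all import GPT4All

model = GPT4All("llama-2-7b-chat.Q4_0.gguf", device="gpu")
print(model.generate("Hello", max_tokens=16))

# Drop the only reference to the old model before loading the next one,
# so both never sit in VRAM at the same time. That the destructor frees
# the GPU allocation right away is an assumption about the bindings.
del model
gc.collect()

model = GPT4All("mistral-7b-instruct-v0.1.Q4_0.gguf", device="gpu")
print(model.generate("Hello again", max_tokens=16))
```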

nimzodisaster avatar Nov 29 '23 00:11 nimzodisaster

No, it's because custom models don't seem to use the GPU. I thought I was going insane until I tried downloading GPT4All Falcon, and it used my GPU. Switching back to my Hugging Face models, they all use the CPU.

nanafy avatar Nov 30 '23 06:11 nanafy

> Switching back to my Hugging Face models, they all use the CPU.

The Vulkan backend only supports Q4_0 and Q4_1 quantizations currently, and Q4_1 is not recommended for LLaMA-2 based models.
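
A quick way to pre-check a file from TheBloke is the quantization tag in the filename, since his GGUF uploads encode it (e.g. `...Q4_K_M.gguf`). This is just a filename heuristic, not a real read of the GGUF metadata:

```python
import re
from pathlib import Path

# All the Vulkan backend currently accelerates.
VULKAN_QUANTS = {"Q4_0", "Q4_1"}

def vulkan_compatible(model_path: str) -> bool:
    """Heuristic: pull the quantization tag out of the filename.

    Files named like 'llama-2-7b-chat.Q4_K_M.gguf' use anything other
    than Q4_0/Q4_1 and will silently run on the CPU instead.
    """
    match = re.search(r"\.(Q\d+_[A-Z0-9_]+)\.gguf$",
                      Path(model_path).name, re.IGNORECASE)
    return bool(match) and match.group(1).upper() in VULKAN_QUANTS

print(vulkan_compatible("llama-2-7b-chat.Q4_0.gguf"))    # True  -> can offload
print(vulkan_compatible("llama-2-7b-chat.Q4_K_M.gguf"))  # False -> CPU fallback
```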

cebtenzzre avatar Dec 05 '23 18:12 cebtenzzre