I am trying to load vicunlocked-65b.ggmlv3.q2_K.bin, but nothing happens; it just quits without any error. This is on the newest build, [master-5c64a09](https://github.com/ggerganov/llama.cpp/releases/tag/master-5c64a09).
```
PS F:\LLAMA\llama.cpp> build\bin\main --model models/new2/vicunlocked-65b.ggmlv3.q2_K.bin --mlock --color --threads 8 --keep -1 --batch_size 512 --n_predict -1 --top_k 10000 --top_p 0.9 --temp 0.96 --repeat_penalty 1.1 --ctx_size 2048 --interactive --instruct --reverse-prompt "### Human:" --reverse-prompt "### User:" --reverse-prompt "### Assistant:" -ngl 36
main: build = 635 (5c64a09)
main: seed = 1686149787
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3090
llama.cpp: loading model from models/new2/vicunlocked-65b.ggmlv3.q2_K.bin
PS F:\LLAMA\llama.cpp>
```
I have the same problem here, on the newest build [master-5c64a09]:
```
F:\llamacpp-k>title Wizard-Vicuna-30B-4ks

F:\llamacpp-k>main --mlock --instruct -i --interactive-first --top_k 60 --top_p 1.1 -c 2048 --color --temp 0.8 -n -1 --keep -1 --repeat_penalty 1.1 -t 6 -m Wizard-Vicuna-30B-Uncensored-ggmlv3-q4_K_S.bin --gpu-layers 20
main: build = 635 (5c64a09)
main: seed = 1686151371
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 2060 SUPER
llama.cpp: loading model from Wizard-Vicuna-30B-Uncensored-ggmlv3-q4_K_S.bin

F:\llamacpp-k>pause
Press any key to continue . . .
```
I had the same issue - a temporary fix is to roll back from the newest release [master-5c64a09] to [master-35a8491]. That seems to work :)
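If you build from source, a minimal sketch of pinning that older tag looks like the following. This assumes a CMake build with the cuBLAS backend enabled (matching the `ggml_init_cublas` lines in the logs above); the exact flags and build directory may differ for your setup:

```
# Pin the last known-good tag and rebuild (CMake/cuBLAS flags are assumptions)
git checkout master-35a8491
cmake -B build -DLLAMA_CUBLAS=ON
cmake --build build --config Release
```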
...yeah, the previous build works... but try to add any GPU layers and you will get ONLY gibberish...
This is fixed in later releases.
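For source builds, picking up the fix is just updating and rebuilding; a sketch under the same CMake/cuBLAS assumptions as above. The `main: build = ...` line in the startup banner confirms which build you are actually running:

```
# Update to the latest master and rebuild (flags are assumptions)
git checkout master && git pull
cmake -B build -DLLAMA_CUBLAS=ON
cmake --build build --config Release
```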
This issue was closed because it has been inactive for 14 days since being marked as stale.