I am trying to load vicunlocked-65b.ggmlv3.q2_K.bin, but nothing happens; it just quits without any error. This is on the newest build, [master-5c64a09](https://github.com/ggerganov/llama.cpp/releases/tag/master-5c64a09).
```
PS F:\LLAMA\llama.cpp> build\bin\main --model models/new2/vicunlocked-65b.ggmlv3.q2_K.bin --mlock --color --threads 8 --keep -1 --batch_size 512 --n_predict -1 --top_k 10000 --top_p 0.9 --temp 0.96 --repeat_penalty 1.1 --ctx_size 2048 --interactive --instruct --reverse-prompt "### Human:" --reverse-prompt "### User:" --reverse-prompt "### Assistant:" -ngl 36
main: build = 635 (5c64a09)
main: seed = 1686149787
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3090
llama.cpp: loading model from models/new2/vicunlocked-65b.ggmlv3.q2_K.bin
PS F:\LLAMA\llama.cpp>
```
I have the same problem here, on the newest build [master-5c64a09]:
```
F:\llamacpp-k>title Wizard-Vicuna-30B-4ks

F:\llamacpp-k>main --mlock --instruct -i --interactive-first --top_k 60 --top_p 1.1 -c 2048 --color --temp 0.8 -n -1 --keep -1 --repeat_penalty 1.1 -t 6 -m Wizard-Vicuna-30B-Uncensored-ggmlv3-q4_K_S.bin --gpu-layers 20
main: build = 635 (5c64a09)
main: seed = 1686151371
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 2060 SUPER
llama.cpp: loading model from Wizard-Vicuna-30B-Uncensored-ggmlv3-q4_K_S.bin

F:\llamacpp-k>pause
Press any key to continue . . .
```
I had the same issue - a temporary fix is to roll back from the newest release [master-5c64a09] to [master-35a8491]. That seems to work :)
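If you build from source, a minimal sketch of pinning that older tag looks like the following. This assumes a CMake build with the cuBLAS backend enabled (matching the `ggml_init_cublas` lines in the logs above); the exact flags and build directory may differ for your setup:

```
# Pin the last known-good tag and rebuild (CMake/cuBLAS flags are assumptions)
git checkout master-35a8491
cmake -B build -DLLAMA_CUBLAS=ON
cmake --build build --config Release
```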
...yeah, the previous build works... but try to add any GPU layers and you will get ONLY gibberish...
This is fixed in later releases.
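For source builds, picking up the fix is just updating and rebuilding; a sketch under the same CMake/cuBLAS assumptions as above. The `main: build = ...` line in the startup banner confirms which build you are actually running:

```
# Update to the latest master and rebuild (flags are assumptions)
git checkout master && git pull
cmake -B build -DLLAMA_CUBLAS=ON
cmake --build build --config Release
```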
This issue was closed because it has been inactive for 14 days since being marked as stale.