
[BUG] llama_model_load_from_file_impl: no backends are loaded

Open iwr-redmond opened this issue 5 months ago • 2 comments

When attempting to load the precompiled build 5866 release files (either CPU or Vulkan) on Linux Mint 21.3 (Ubuntu Jammy, Python 3.10), an error occurs:

[2025-07-31 Thu 09:16:21.876] INFO: easy_llama v0.2.14 targeting llama.cpp@0b885577 (2025-07-10)
[2025-07-31 Thu 09:16:21.877] INFO: loaded libllama from /home/redmond/Applications/llamacpp/libllama.so
llama_model_load_from_file_impl: no backends are loaded. hint: use ggml_backend_load() or ggml_backend_load_all() to load a backend before calling this function
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/redmond/.local/lib/python3.10/site-packages/easy_llama/llama.py", line 872, in __init__
    self._model = _LlamaModel(
  File "/home/redmond/.local/lib/python3.10/site-packages/easy_llama/llama.py", line 604, in __init__
    null_ptr_check(self.model, "self.model", "_LlamaModel.__init__")
  File "/home/redmond/.local/lib/python3.10/site-packages/easy_llama/utils.py", line 265, in null_ptr_check
    raise LlamaNullException(f"{loc_hint}: pointer {ptr_name} is null")
easy_llama.utils.LlamaNullException: _LlamaModel.__init__: pointer self.model is null

Repro:

pip install easy-llama
export LIBLLAMA=/home/redmond/Applications/llamacpp/libllama.so
python
import easy_llama as ez
model="/home/redmond/.cache/gpt4all/Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf"
MyLlama = ez.Llama(model)
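
The hint in the log points at ggml_backend_load_all(), which the precompiled builds (with dynamically loaded backends) expect to be called before any model is opened. A rough, untested workaround sketch, assuming the release ships a libggml.so next to libllama.so that exports that symbol:

import ctypes
import easy_llama as ez

# Hypothetical workaround, untested: load the ggml core library that ships
# next to libllama.so (file name assumed) and register all backends before
# easy_llama opens the model.
ggml = ctypes.CDLL("/home/redmond/Applications/llamacpp/libggml.so", mode=ctypes.RTLD_GLOBAL)
ggml.ggml_backend_load_all.restype = None
ggml.ggml_backend_load_all()
# If the backend .so files are not picked up automatically, the per-directory
# variant (if this build exports it) can be pointed at the release folder:
# ggml.ggml_backend_load_all_from_path(b"/home/redmond/Applications/llamacpp")

model = "/home/redmond/.cache/gpt4all/Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf"
MyLlama = ez.Llama(model)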

iwr-redmond avatar Jul 30 '25 21:07 iwr-redmond

This may be a bug in the way the mainline llama.cpp release binaries are compiled. The CUDA build from NexaAI works as expected.

iwr-redmond avatar Aug 05 '25 18:08 iwr-redmond

Same thing here. The precompiled Vulkan build b7103 causes:

[2025-11-19 Wed 21:33:57.448] INFO: loaded libllama from M:\AIDetector\llama.cpp\llama.dll
llama_model_load_from_file_impl: no backends are loaded. hint: use ggml_backend_load() or ggml_backend_load_all() to load a backend before calling this function

Despite this, the precompiled llama-bench works fine:

M:\AIDetector\llama.cpp>llama-bench.exe -m ../bartowski\Meta-Llama-3.1-8B-Instruct-GGUF\Meta-Llama-3.1-8B-Instruct-Q6_K.gguf
load_backend: loaded RPC backend from M:\AIDetector\llama.cpp\ggml-rpc.dll
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 5700 XT (AMD proprietary driver) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 32768 | int dot: 0 | matrix cores: none
load_backend: loaded Vulkan backend from M:\AIDetector\llama.cpp\ggml-vulkan.dll
load_backend: loaded CPU backend from M:\AIDetector\llama.cpp\ggml-cpu-haswell.dll

| model         |     size |  params | backend | ngl |  test |           t/s |
| ------------- | -------: | ------: | ------- | --: | ----: | ------------: |
| llama 8B Q6_K | 6.14 GiB |  8.03 B | Vulkan  |  99 | pp512 | 367.03 ± 5.53 |
| llama 8B Q6_K | 6.14 GiB |  8.03 B | Vulkan  |  99 | tg128 |  49.73 ± 0.36 |
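
For what it's worth, the load_backend lines above show that llama-bench registers the backend DLLs itself, so the libraries themselves look fine; it seems nothing on the Python side ever calls ggml_backend_load_all(). A rough, untested Windows sketch of the same manual workaround (DLL and directory names assumed from the log above):

import os
import ctypes

# Hypothetical workaround, untested: register the ggml backends by hand
# before easy_llama loads the model.
llama_dir = r"M:\AIDetector\llama.cpp"
os.add_dll_directory(llama_dir)  # let the dependent ggml-*.dll files resolve

ggml = ctypes.CDLL(os.path.join(llama_dir, "ggml.dll"))
ggml.ggml_backend_load_all.restype = None
ggml.ggml_backend_load_all()
# If no backends are found, running Python from the llama.cpp folder (so the
# DLLs sit next to the working directory) may also be needed.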

daemonserj avatar Nov 19 '25 14:11 daemonserj