Intel Mac (i9 - 5500M) (macOS 15.3.2) - ValueError: Failed to create llama_context - llama_init_from_model: failed to initialize Metal backend
Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- [x] I carefully followed the README.md.
- [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- [x] I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
I am trying to run a local LLM:
```python
from llama_cpp import Llama

llm = Llama(model_path="/Users/mfzainulabideen/Downloads/Llama/Llama-3.2-3B-Instruct/Llama-3.2-3B-Instruct-F16.gguf")
```
The cell should execute and the model should load successfully, so that I can continue experimenting with the LLM in the notebook.
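For context, here is roughly what I plan to run once loading works (the prompt and parameter values are illustrative placeholders, not part of the failing code):

```python
# Hypothetical follow-up usage; the prompt, max_tokens, and stop values
# are placeholders for whatever I end up doing with the model.
output = llm(
    "Q: Name the planets in the solar system. A:",
    max_tokens=64,          # cap the completion length
    stop=["Q:", "\n"],      # stop sequences for the Q/A prompt format
)
print(output["choices"][0]["text"])
```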
Current Behavior
The code doesn't run: the `Llama` constructor raises `ValueError: Failed to create llama_context` after `llama_init_from_model` fails to initialize the Metal backend (full traceback below).
Environment and Context
I am trying to execute this on a 2019 16" MBP (2.4 GHz i9, AMD Radeon Pro 5500M, 32 GB RAM, 1 TB SSD) running macOS 15.3.2.
I am using the latest miniconda env with Python 3.11.11.
Toolchain:

```
GNU Make 3.81, built for i386-apple-darwin11.3.0

Apple clang version 16.0.0 (clang-1600.0.26.6)
Target: x86_64-apple-darwin24.4.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
```
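For completeness, the installed wheel can be inspected from Python like this (a sketch; I believe the low-level `llama_supports_gpu_offload` binding is re-exported at the package top level in this version, but treat that as an assumption):

```python
import llama_cpp

# Version of the installed llama-cpp-python wheel.
print(llama_cpp.__version__)

# Low-level binding mirroring llama.h; reports whether this build was
# compiled with GPU offload support (Metal on this machine).
print(llama_cpp.llama_supports_gpu_offload())
```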
Failure Information (for bugs)
Python throws the following error:
```
ValueError                                Traceback (most recent call last)
Cell In[2], line 1
----> 1 llm = Llama(model_path="/Users/mfzainulabideen/Downloads/Llama/Llama-3.2-3B-Instruct/Llama-3.2-3B-Instruct-F16.gguf")

File /opt/miniconda3/envs/test/lib/python3.11/site-packages/llama_cpp/llama.py:393, in Llama.__init__(self, model_path, n_gpu_layers, split_mode, main_gpu, tensor_split, rpc_servers, vocab_only, use_mmap, use_mlock, kv_overrides, seed, n_ctx, n_batch, n_ubatch, n_threads, n_threads_batch, rope_scaling_type, pooling_type, rope_freq_base, rope_freq_scale, yarn_ext_factor, yarn_attn_factor, yarn_beta_fast, yarn_beta_slow, yarn_orig_ctx, logits_all, embedding, offload_kqv, flash_attn, no_perf, last_n_tokens_size, lora_base, lora_scale, lora_path, numa, chat_format, chat_handler, draft_model, tokenizer, type_k, type_v, spm_infill, verbose, **kwargs)
    388 self.context_params.n_batch = self.n_batch
    389 self.context_params.n_ubatch = min(self.n_batch, n_ubatch)
    391 self._ctx = self._stack.enter_context(
    392     contextlib.closing(
--> 393         internals.LlamaContext(
    394             model=self._model,
    395             params=self.context_params,
    396             verbose=self.verbose,
    397         )
    398     )
    399 )
    401 self._batch = self._stack.enter_context(
    402     contextlib.closing(
    403         internals.LlamaBatch(
    (...) 409
    410
    412 self._lora_adapter: Optional[llama_cpp.llama_adapter_lora_p] = None

File /opt/miniconda3/envs/test/lib/python3.11/site-packages/llama_cpp/_internals.py:255, in LlamaContext.__init__(self, model, params, verbose)
    252 ctx = llama_cpp.llama_new_context_with_model(self.model.model, self.params)
    254 if ctx is None:
--> 255     raise ValueError("Failed to create llama_context")
    257 self.ctx = ctx
    259 def free_ctx():

ValueError: Failed to create llama_context
```
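As a possible workaround (untested at the time of filing), `n_gpu_layers` is one of the `Llama.__init__` parameters visible in the signature above, and setting it to 0 should keep every layer on the CPU. Whether that fully bypasses Metal initialization on this build is an assumption:

```python
from llama_cpp import Llama

# Workaround sketch: offload nothing to the GPU so the Metal backend is
# (hopefully) never touched. Whether n_gpu_layers=0 skips Metal context
# creation entirely on this build is an assumption, not confirmed.
llm = Llama(
    model_path="/Users/mfzainulabideen/Downloads/Llama/Llama-3.2-3B-Instruct/Llama-3.2-3B-Instruct-F16.gguf",
    n_gpu_layers=0,
)
```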
Steps to Reproduce
```
conda create -n llm python=3.11 -y && conda activate llm
pip install jupyter
pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir --verbose
```
Then, in a notebook with the `llm` env as the kernel, run:

```python
from llama_cpp import Llama

llm = Llama(model_path="/Users/mfzainulabideen/Downloads/Llama/Llama-3.2-3B-Instruct/Llama-3.2-3B-Instruct-F16.gguf")
```

Adjust `model_path` to wherever your GGUF file lives.
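If CPU-only loading isn't enough, rebuilding the wheel without the Metal backend should rule it out entirely. A sketch, assuming `GGML_METAL` is the relevant llama.cpp CMake option for this vendored revision (older releases used `LLAMA_METAL` instead):

```
CMAKE_ARGS="-DGGML_METAL=OFF" pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir --verbose
```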
Failure Logs
Please find attached the verbose log up to the point of failure: llama-cpp-python-verbose-log.txt
My environment info:
```
llama-cpp-python$ git log | head -1
commit 37eb5f0a4c2a8706b89ead1406b1577c4602cdec

llama-cpp-python$ python3 --version
Python 3.11.11

llama-cpp-python$ pip list | egrep "uvicorn|fastapi|sse-starlette|numpy"
numpy              2.2.4

llama-cpp-python/vendor/llama.cpp$ git log | head -3
commit 37eb5f0a4c2a8706b89ead1406b1577c4602cdec
Author: Andrei Betlen <[email protected]>
Date:   Wed Mar 12 05:30:21 2025 -0400
```