
ValueError: Failed to create llama_context

Open Isaakkamau opened this issue 11 months ago • 5 comments

(yuna2) (base) adm@Adms-MacBook-Pro yuna-ai % python index.py
ggml_metal_init: error: Error Domain=MTLLibraryErrorDomain Code=3 "program_source:3:10: fatal error: 'ggml-common.h' file not found
#include "ggml-common.h"
         ^~~~~~~~~~~~~~~
" UserInfo={NSLocalizedDescription=program_source:3:10: fatal error: 'ggml-common.h' file not found
#include "ggml-common.h"
         ^~~~~~~~~~~~~~~
}
llama_new_context_with_model: failed to initialize Metal backend
Traceback (most recent call last):
  File "/Users/adm/Desktop/yuna-ai/index.py", line 171, in <module>
    yuna_server = YunaServer()
                  ^^^^^^^^^^^^
  File "/Users/adm/Desktop/yuna-ai/index.py", line 33, in __init__
    self.chat_generator = ChatGenerator(self.config)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/adm/Desktop/yuna-ai/lib/generate.py", line 11, in __init__
    self.model = Llama(
                 ^^^^^^
  File "/usr/local/lib/python3.12/site-packages/llama_cpp/llama.py", line 328, in __init__
    self._ctx = _LlamaContext(
                ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/llama_cpp/_internals.py", line 265, in __init__
    raise ValueError("Failed to create llama_context")
ValueError: Failed to create llama_context

(Screenshot attached: Screen Shot 2024-03-25 at 23 43 18)

Isaakkamau avatar Mar 25 '24 20:03 Isaakkamau

Maybe you need to reinstall llama-cpp-python with the following command:

CMAKE_ARGS="-DLLAMA_METAL_EMBED_LIBRARY=ON -DLLAMA_METAL=on" pip3 install -U --force-reinstall llama-cpp-python --no-cache-dir

Answer from: https://github.com/abetlen/llama-cpp-python/issues/1285#issuecomment-2007778703
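For context, -DLLAMA_METAL_EMBED_LIBRARY=ON embeds the Metal shader source into the compiled library, so ggml no longer has to locate ggml-common.h on disk at runtime, which is exactly the lookup that fails in the log above. After reinstalling, a quick smoke test looks roughly like this (a sketch; the model path is a placeholder for any GGUF file you have):

# Sketch: confirm the Metal backend initializes after the reinstall.
# "model.gguf" is a placeholder; point it at any real GGUF model.
from llama_cpp import Llama

llm = Llama(
    model_path="model.gguf",  # placeholder path
    n_gpu_layers=-1,          # offload every layer; Metal init failures surface immediately
    verbose=True,             # prints the ggml_metal_init / llama_new_context_with_model logs
)
print(llm("Q: What is 2 + 2? A:", max_tokens=8)["choices"][0]["text"])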

JackyCCK2126 avatar Mar 26 '24 15:03 JackyCCK2126

Hey, @JackyCCK2126! I was having the same issue, and now it works! Thanks, but what was the problem?

Also, is there any workaround to speed up the generation on the M1?

yukiarimo avatar Mar 28 '24 02:03 yukiarimo

@yukiarimo I don't know much about the M1. But in general, you can offload more layers to the GPU and lower the context size when initializing the Llama class by setting n_gpu_layers and n_ctx. (top_p and top_k may also affect speed a bit.) If it is still too slow, you can choose a smaller model.

However, if your prompt is not too long, you should get around 7 to 12 tokens per second, which is acceptable enough for me.
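Roughly, those knobs look like this (a sketch with illustrative values; the model path is a placeholder):

from llama_cpp import Llama

llm = Llama(
    model_path="model.gguf",  # placeholder path
    n_gpu_layers=-1,          # offload all layers to the GPU (Metal on Apple Silicon)
    n_ctx=2048,               # a smaller context window lowers memory use and warm-up time
)

out = llm(
    "Write a haiku about the sea.",
    max_tokens=64,
    top_k=40,                 # sampling settings; they only nudge speed slightly
    top_p=0.9,
)
print(out["choices"][0]["text"])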

JackyCCK2126 avatar Mar 28 '24 05:03 JackyCCK2126

@yukiarimo If you find a speed-up solution, please let me know. XD

JackyCCK2126 avatar Mar 28 '24 05:03 JackyCCK2126

Maybe you need to reinstall llama-cpp-python with the following command:

CMAKE_ARGS="-DLLAMA_METAL_EMBED_LIBRARY=ON -DLLAMA_METAL=on" pip3 install -U --force-reinstall llama-cpp-python --no-cache-dir

Answer from: #1285 (comment)

Doesn't seem to solve it for me... Do you happen to know if I'm missing something?

That's the end of the traceback:

(Screenshot attached: Screen Shot 2024-04-17 at 23 45 45)

Spider-netizen avatar Apr 17 '24 21:04 Spider-netizen

Maybe you need to reinstall llama-cpp-python with the following command:

CMAKE_ARGS="-DLLAMA_METAL_EMBED_LIBRARY=ON -DLLAMA_METAL=on" pip3 install -U --force-reinstall llama-cpp-python --no-cache-dir

Answer from: #1285 (comment)

Doesn't seem to solve it for me... Do you happen to know if I'm missing something?

That's the end of the traceback:

(Screenshot attached: Screen Shot 2024-04-17 at 23 45 45)

I have the same result too. It failed to create llama_context. I was wondering why I need to set -DLLAMA_METAL=on? I think Metal is for MacBooks, but I was running llama.cpp on a Windows PC.

ScofieldYeh avatar Jun 02 '24 09:06 ScofieldYeh

Maybe you need to reinstall llama-cpp-python with the following command:

CMAKE_ARGS="-DLLAMA_METAL_EMBED_LIBRARY=ON -DLLAMA_METAL=on" pip3 install -U --force-reinstall llama-cpp-python --no-cache-dir

Answer from: #1285 (comment)

Doesn't seem to solve it for me... Do you happen to know if I'm missing something? That's the end of the traceback: (Screenshot attached: Screen Shot 2024-04-17 at 23 45 45)

I have the same result too. It failed to create llama_context. I was wondering why I need to set -DLLAMA_METAL=on? I think Metal is for MacBooks, but I was running llama.cpp on a Windows PC.

Yes. Metal is only for Apple's products.
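On Windows you can simply drop the Metal flags when reinstalling, roughly like this (a sketch; run it from your Python environment's shell):

pip install -U --force-reinstall --no-cache-dir llama-cpp-python

If you want GPU offload on an NVIDIA card, set CMAKE_ARGS to the CUDA build flag before running the same command; the flag name depends on your llama-cpp-python version (older releases documented -DLLAMA_CUBLAS=on, newer ones -DGGML_CUDA=on).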

JackyCCK2126 avatar Jun 02 '24 10:06 JackyCCK2126