llama-cpp-python icon indicating copy to clipboard operation
llama-cpp-python copied to clipboard

Llama 4 not working

Open Kenshiro-28 opened this issue 8 months ago • 7 comments

llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'llama4' llama_model_load_from_file_impl: failed to load model

Please update to a newer version of llama.cpp:

https://github.com/ggml-org/llama.cpp/releases/tag/b5074

Kenshiro-28 avatar Apr 08 '25 14:04 Kenshiro-28

My fork project has added some updates of llama4: https://github.com/JamePeng/llama-cpp-python

JamePeng avatar Apr 08 '25 14:04 JamePeng

Same issue, how to run llama4?

kerlion avatar Apr 14 '25 09:04 kerlion

@kerlion What version of llama-cpp-python are you using? Can you also give me some inside about your platform (OS, etc).

AleefBilal avatar Apr 17 '25 07:04 AleefBilal

@kerlion What version of llama-cpp-python are you using? Can you also give me some inside about your platform (OS, etc).

image: nvidia/cuda:12.2.0-runtime-ubuntu22.04 llama_cpp_python 0.3.8

kerlion avatar Apr 17 '25 08:04 kerlion

I compiled it from the source code, passed this error. But I do not know which "chat_format" to use? Llama-4-Scout-17B-16E-Instruct-UD-Q2_K_XL

kerlion avatar Apr 17 '25 08:04 kerlion

@kerlion Great job on compiling it from source. Below is the command that might save you from the struggle of source compiling. CMAKE_ARGS="-DGGML_CUDA=ON -DLLAMA_LLAVA=OFF" pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir Furthermore, i wasn't able to quite understand your message about using which "chat_format", can you please elaborate.

AleefBilal avatar Apr 17 '25 09:04 AleefBilal

same error with llama_cpp_python 0.3.8:

print_info: file format = GGUF V3 (latest) print_info: file type = Q4_K - Medium print_info: file size = 62.90 GiB (5.01 BPW) llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'llama4' llama_model_load_from_file_impl: failed to load model

h-haghpanah avatar Apr 20 '25 07:04 h-haghpanah

My fork project has added some updates of llama4: https://github.com/JamePeng/llama-cpp-python

Could you please provide your commit number ?

perronemirko avatar May 07 '25 18:05 perronemirko