
unknown model architecture: 'gemma-embedding'

Open mariocannistra opened this issue 3 months ago • 3 comments

I am running llama-cpp-python version 0.3.16.

When trying to load the recently released embeddinggemma-300M model, I get the following error message:

llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'gemma-embedding'
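
For context, this is roughly how I am trying to load it; the GGUF file name below is just a placeholder for my local conversion:

from llama_cpp import Llama

# Placeholder path to a local GGUF conversion of embeddinggemma-300M
llm = Llama(
    model_path="./embeddinggemma-300M.gguf",
    embedding=True,  # run the model in embedding mode
)

result = llm.create_embedding("hello world")
print(len(result["data"][0]["embedding"]))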

Support for this model architecture has been added to llama.cpp in this build: https://github.com/ggml-org/llama.cpp/releases/tag/b6384

Could you please update llama-cpp-python to pick up this addition?

mariocannistra avatar Sep 05 '25 09:09 mariocannistra

I tried just dropping the latest llama.cpp into vendor/ and rebuilding, which resulted in this error when trying to load the model:

.local/lib/python3.13/site-packages/llama_cpp/llama_cpp.py", line 1408, in
    @ctypes_function(
        "llama_get_kv_self",
        [llama_context_p_ctypes],
        llama_kv_cache_p_ctypes,
    )
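
If I read the traceback right, llama_cpp.py in 0.3.16 still declares llama_get_kv_self, which the newer llama.cpp no longer exports, so the symbol lookup behind that @ctypes_function declaration fails when the module is imported. It is roughly the same failure you get from ctypes whenever a declared symbol is missing from the shared library (the library name in this sketch is just illustrative):

import ctypes

# Illustrative: load the rebuilt llama.cpp shared library
lib = ctypes.CDLL("libllama.so")

try:
    lib.llama_get_kv_self  # ctypes resolves the symbol on attribute access
except AttributeError as err:
    print("missing symbol:", err)  # this kind of lookup failure surfaces at import time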

jballn avatar Sep 10 '25 22:09 jballn

So... nothing being done on this? I have the same issue as the OP.

max38tech avatar Oct 01 '25 09:10 max38tech

The error occurs because support for the gemma-embedding architecture was added to the upstream llama.cpp library only recently, and the llama-cpp-python bindings are not yet up to date with a llama.cpp version that includes it.
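
Until a release ships with a vendored llama.cpp at or after b6384, you can at least confirm which binding version you have installed (0.3.16, as reported above, predates that build):

import llama_cpp

# Any release cut before llama.cpp b6384 was vendored will still reject
# the 'gemma-embedding' architecture at model load time.
print(llama_cpp.__version__)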

atharva-again avatar Oct 08 '25 07:10 atharva-again

Blessings, any update?

diegocaumont avatar Nov 21 '25 20:11 diegocaumont