
feat: Add Gemma3 chat handler (#1976)

Open kossum opened this issue 8 months ago • 25 comments

Added a Gemma3 chat handler, fixed image embedding, and added support for multiple images.

Newly bound llama.cpp functions and structures:

  • clip_image_load_from_bytes
  • clip_image_batch_encode
  • clip_image_preprocess
  • clip_image_f32_batch_init
  • clip_image_f32_batch_free
  • clip_image_u8_init
  • clip_image_u8_free

Usage (current version, after Apr 4, 2025):

from llama_cpp import Llama
from llama_cpp.llama_chat_format import Gemma3ChatHandler

chat_handler = Gemma3ChatHandler(clip_model_path="path/to/mmproj")
llama = Llama(
  model_path="path/to/model",
  chat_handler=chat_handler,
  n_ctx=1024,  # n_ctx should be increased to accommodate the image embedding
)

messages = [
  {
    'role': 'user',
    'content': [
      {'type': 'text', 'text': 'Please describe this image'},
      {'type': 'image_url', 'image_url': 'https://raw.githubusercontent.com/huggingface/transformers/refs/heads/main/tests/fixtures/tests_samples/COCO/000000039769.png'},
    ]
  }
]

output = llama.create_chat_completion(
  messages,
  stop=['<end_of_turn>', '<eos>'],
  max_tokens=200,
)

print(output['choices'][0]['message']['content'])
Note: in the current version the image content format changed; the old 'image' content type was replaced with 'image_url':

- {'type': 'image', 'image': ...}
+ {'type': 'image_url', 'image_url': ...}
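Since the handler supports multiple images, a single user turn can carry several image_url parts. A sketch with placeholder URLs:

```python
# Sketch: two images in one user message (URLs are placeholders).
messages = [
    {
        'role': 'user',
        'content': [
            {'type': 'text', 'text': 'Compare these two images'},
            {'type': 'image_url', 'image_url': 'https://example.com/a.png'},
            {'type': 'image_url', 'image_url': 'https://example.com/b.png'},
        ],
    }
]
```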

Test Results:

Compatibility:

  • Fully backward compatible with existing interfaces.
  • Maintains original APIs while adding new options and interfaces.

kossum avatar Mar 30 '25 19:03 kossum