
feat: Add Gemma3 chat handler (#1976)

Open kossum opened this issue 8 months ago • 25 comments

Added a Gemma3 chat handler, fixed image embedding, and added support for multiple images.

Newly bound llama.cpp functions and structures:

  • clip_image_load_from_bytes
  • clip_image_batch_encode
  • clip_image_preprocess
  • clip_image_f32_batch_init
  • clip_image_f32_batch_free
  • clip_image_u8_init
  • clip_image_u8_free

Usage (current version, after Apr 4, 2025):

from llama_cpp import Llama
from llama_cpp.llama_chat_format import Gemma3ChatHandler

chat_handler = Gemma3ChatHandler(clip_model_path="path/to/mmproj")
llama = Llama(
  model_path="path/to/model",
  chat_handler=chat_handler,
  n_ctx=1024,  # n_ctx should be increased to accommodate the image embedding
)

messages = [
  {
    'role': 'user',
    'content': [
      {'type': 'text', 'text': 'Please describe this image'},
      {'type': 'image_url', 'image_url': 'https://raw.githubusercontent.com/huggingface/transformers/refs/heads/main/tests/fixtures/tests_samples/COCO/000000039769.png'},
    ]
  }
]

output = llama.create_chat_completion(
  messages,
  stop=['<end_of_turn>', '<eos>'],
  max_tokens=200,
)

print(output['choices'][0]['message']['content'])
Note: in the current version the image content format changed; the old 'image' content type was replaced with 'image_url':

- {'type': 'image', 'image': ...}
+ {'type': 'image_url', 'image_url': ...}
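Since the handler supports multiple images, a single user turn can carry several image_url parts. A sketch with placeholder URLs:

```python
# Sketch: two images in one user message (URLs are placeholders).
messages = [
    {
        'role': 'user',
        'content': [
            {'type': 'text', 'text': 'Compare these two images'},
            {'type': 'image_url', 'image_url': 'https://example.com/a.png'},
            {'type': 'image_url', 'image_url': 'https://example.com/b.png'},
        ],
    }
]
```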

Test Results:

Compatibility:

  • Fully backward compatible with existing interfaces.
  • Maintains original APIs while adding new options and interfaces.

kossum avatar Mar 30 '25 19:03 kossum