paligemma topics

notebooks

9.0k

Stars

1.4k

Forks

9.0k

Watchers

A collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-edge models like YOLO11, RT-DETR, SAM 2,...

roboflow

computer-vision

deep-learning

deep-neural-networks

image-classification

maestro

2.6k

Stars

219

Forks

2.6k

Watchers

Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL.

roboflow

cross-modal

gpt-4

gpt-4-vision

instance-segmentation

ms-swift

3.6k

Stars

310

Forks

Watchers

Use PEFT or full-parameter training to fine-tune 350+ LLMs or 90+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, InternLM2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Visio...

modelscope

agent

aigc

baichuan

chatglm

mlx-vlm

1.9k

Stars

212

Forks

1.9k

Watchers

MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.

Blaizzy

llava

llm

mlx

vision-transformer

YoloGemma

84

Stars

6

Forks

84

Watchers

Testing and evaluating the capabilities of vision-language models (PaliGemma) on computer vision tasks such as object detection and segmentation.

adithya-s-k

gemma

paligemma

vlm