Running SigLIP/SigLIP2 on MLX?
Hi, just wondering if there is support for running SigLIP models on MLX? This PR seems to indicate so: https://github.com/Blaizzy/mlx-vlm/pull/24
Yes, we do support it.
But I believe you want just SigLIP, right?
Correct, SigLIP2 would be the one. Any resources you could point me towards for converting models and running them on MLX?
@Blaizzy guessing we could figure it out from `mlx_vlm/models/paligemma/vision.py`. Will the conversion scripts in the comments work for SigLIP2?
```
from transformers import AutoModelForCausalLM, AutoProcessor

# Download the model and processor from the Hugging Face Hub
model_id = "<huggingface_model_id>"
model = AutoModelForCausalLM.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

# Save both to the same local directory for conversion
model.save_pretrained("<local_dir>")
processor.save_pretrained("<local_dir>")
```
Then use the `<local_dir>` as the `--hf-path` in the convert script:
```
python -m mlx_vlm.convert --hf-path <local_dir> --mlx-path <mlx_dir>
```
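Once converted, you can sanity-check the weights by loading them back with mlx-vlm. A minimal smoke-test sketch, assuming the `load`/`generate` API shown in the mlx-vlm README (the model directory and image path are placeholders, and the `generate` argument order has varied across versions):
```
from mlx_vlm import load, generate

# Load the converted model back to confirm the conversion worked
model, processor = load("<mlx_dir>")

# Quick smoke test; check the README of your installed mlx-vlm version
# for the exact generate() argument order, which has changed over time
output = generate(model, processor, "<path_or_url_to_image>", "Describe this image.")
print(output)
```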
I saw some other GH issue where someone had problems converting SigLIP models (can't find the link now). Any guidance would be hugely appreciated. Is it worth digging into this, i.e. how much work is involved, and how much of a performance boost could we expect from running on MLX versus MPS in PyTorch?
Found the issue where people had trouble converting SigLIP models to MLX: https://github.com/ml-explore/mlx-examples/issues/747
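On the MLX-vs-MPS question, one rough way to get a number is to time the image encoder in both frameworks on identical inputs. A hypothetical sketch of the PyTorch/MPS side (the model id, loop count, and dummy image are illustrative only, and this assumes an Apple-silicon Mac):
```
import time

import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

# Illustrative checkpoint; swap in the SigLIP variant you care about
model_id = "google/siglip-base-patch16-224"
device = "mps" if torch.backends.mps.is_available() else "cpu"

model = AutoModel.from_pretrained(model_id).to(device).eval()
processor = AutoProcessor.from_pretrained(model_id)

image = Image.new("RGB", (224, 224))  # dummy image, just for timing
inputs = processor(images=image, return_tensors="pt").to(device)

with torch.no_grad():
    model.get_image_features(**inputs)  # warm-up run
    if device == "mps":
        torch.mps.synchronize()
    start = time.perf_counter()
    for _ in range(20):
        model.get_image_features(**inputs)
    if device == "mps":
        torch.mps.synchronize()
    print(f"avg latency: {(time.perf_counter() - start) / 20 * 1000:.1f} ms")
```
The MLX side would be the analogous loop, but with `mx.eval(...)` on the output inside the timed region, since MLX evaluates lazily.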
I added SigLIP support here:
https://github.com/Blaizzy/mlx-embeddings/releases/tag/v0.0.2
SigLIP2 is next.
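For anyone wanting to try the release, a rough sketch of what usage might look like. I haven't verified the mlx-embeddings v0.0.2 API, so the module path, `load` signature, and calling convention below are all assumptions modeled on sibling libraries like mlx-lm; check the release README for the real interface:
```
# All names below are assumptions modeled on mlx-lm, not verified
# against mlx-embeddings v0.0.2; consult the release README.
from mlx_embeddings.utils import load  # assumed module path

# Assumed signature: returns the model plus its tokenizer/processor
model, tokenizer = load("<converted_siglip_model>")

# Embed a caption with the SigLIP text tower (assumed calling convention)
input_ids = tokenizer.encode("a photo of a cat", return_tensors="mlx")
embedding = model(input_ids)
```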
Amazing! Will be very interesting to see what kind of speedups we can get compared to MPS/PyTorch. I've been playing around with the mexma-siglip2 variant recently, which has shown very good performance. Do you think it could be included when support for the other SigLIP2 variants is implemented? https://huggingface.co/visheratin/mexma-siglip2
You will be able to use my SigLIP2 implementation and add the changes needed to support mexma.