Running Siglip/Siglip2 on MLX?

Open maxlund opened this issue 9 months ago • 7 comments

Hi, just wondering whether there is support for running Siglip models on MLX? This PR seems to indicate so: https://github.com/Blaizzy/mlx-vlm/pull/24

maxlund avatar Feb 27 '25 15:02 maxlund

Yes, we do support it.

But I believe you want just SigLip, right?

Blaizzy avatar Mar 03 '25 15:03 Blaizzy

Correct, SigLIP2 would be the one. Any resource(s) you could point me towards for converting models and running them on mlx?

maxlund avatar Mar 03 '25 17:03 maxlund

@Blaizzy guessing we could figure it out from mlx_vlm/models/paligemma/vision.py. Will the conversion script in the comments there work for SigLIP2?

```
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "<huggingface_model_id>"
model = AutoModelForCausalLM.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

model.save_pretrained("<local_dir>")
processor.save_pretrained("<local_dir>")
```
Then use the <local_dir> as the --hf-path in the convert script.
```
python -m mlx_vlm.convert --hf-path <local_dir> --mlx-path <mlx_dir>
```

I saw some other GH issue where someone had problems converting SigLIP models (can't find the link now). Any guidance would be hugely appreciated - is it worth digging into this, i.e. how much work is involved, and how much of a performance boost could we expect from running on MLX vs MPS in PyTorch?
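For the MLX vs MPS comparison, one option is to measure it directly with a small timing harness. The following is only a sketch under stated assumptions: `benchmark` and `speedup` are hypothetical helper names (not part of mlx-vlm or PyTorch), and the demo workloads are stand-ins for a real SigLIP forward pass. Both backends execute asynchronously or lazily, so the callable being timed has to block until the result is actually computed, for example by calling `mx.eval(...)` on the MLX output or `torch.mps.synchronize()` after the PyTorch call.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], warmup: int = 3, iters: int = 20) -> Dict[str, float]:
    """Time fn() and report latency in milliseconds.

    fn must block until its work is finished. With MLX that means forcing
    evaluation (e.g. mx.eval on the output); with PyTorch on MPS it means
    synchronizing the device (e.g. torch.mps.synchronize()). Otherwise only
    the time to enqueue the work is measured.
    """
    # Warm-up runs are discarded: they absorb one-off costs such as
    # kernel compilation and memory allocation.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)

    return {
        "median_ms": statistics.median(samples),
        "mean_ms": statistics.fmean(samples),
        "min_ms": min(samples),
    }


def speedup(baseline: Dict[str, float], candidate: Dict[str, float]) -> float:
    """Ratio of median latencies; values above 1.0 mean the candidate is faster."""
    return baseline["median_ms"] / candidate["median_ms"]


if __name__ == "__main__":
    # Stand-in workloads (assumption): replace these lambdas with the
    # PyTorch/MPS and MLX forward passes you want to compare.
    baseline = benchmark(lambda: sum(i * i for i in range(200_000)), iters=5)
    candidate = benchmark(lambda: sum(i * i for i in range(50_000)), iters=5)
    print(f"baseline median:  {baseline['median_ms']:.2f} ms")
    print(f"candidate median: {candidate['median_ms']:.2f} ms")
    print(f"speedup: {speedup(baseline, candidate):.2f}x")
```

Using the median rather than the mean keeps a single slow iteration from skewing the comparison, and running both backends with the same batch size, image resolution, and dtype keeps the ratio meaningful.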

maxlund avatar Mar 12 '25 01:03 maxlund

Found the issue where people had trouble converting SigLIP models to MLX: https://github.com/ml-explore/mlx-examples/issues/747

maxlund avatar Mar 12 '25 16:03 maxlund

I added Siglip support here:

https://github.com/Blaizzy/mlx-embeddings/releases/tag/v0.0.2

Siglip2 is next.

Blaizzy avatar Mar 29 '25 18:03 Blaizzy

Amazing! Will be very interesting to see what kind of speedups we can get compared to MPS/PyTorch. I've been playing around with the mexma-siglip2 variant recently, which has shown very good performance. Do you think it could be included when support for the other Siglip2 variants is implemented? https://huggingface.co/visheratin/mexma-siglip2

maxlund avatar Mar 30 '25 00:03 maxlund

You will be able to use my siglip2 implementation and add the changes needed to support mexma.

Blaizzy avatar Mar 30 '25 00:03 Blaizzy