David Koski
@DePasqualeOrg ^^^ not sure if this notified you -- we are missing some code that lives in transformers.
> Got it. Do you want to take on that part? I don't know if I'll be able to add anything else today. Maybe -- I will post here when/if...
My thoughts on debugging (I have run into similar things with a few models I have ported):
- we have a working Python version, we can compare to that
- ...
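For the comparison step, one low-tech approach is to dump intermediate activations on the Python side and diff them against the Swift port layer by layer. A minimal sketch (the helper name and output path are illustrative, not from the repo):

```python
import os
import numpy as np

def dump_activation(name, array, out_dir="debug_dumps"):
    """Save an intermediate tensor from the working Python model so the
    Swift port can load and diff it against its own values."""
    os.makedirs(out_dir, exist_ok=True)
    np.save(os.path.join(out_dir, f"{name}.npy"), np.asarray(array))

# e.g. inside the Python model's forward pass:
# dump_activation("vision_tower_out", vision_features)
```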
Ah, that is unfortunate -- I wonder if the python code is set up to call it the same way without an image? Anyway, I made a red image to...
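For reference, a solid-color test image like that is trivial to generate, and with a red input the model should at least mention the color if the vision path is wired up correctly. A sketch (the size is arbitrary; the preprocessor resizes to whatever the vision tower expects):

```python
from PIL import Image

# Solid red test image for sanity-checking the vision path.
Image.new("RGB", (512, 512), color=(255, 0, 0)).save("red.png")
```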
I copied the tokens & mask array from the python version into swift and got the same garbled output. So probably not the tokenizing, but there _are_ differences. It looks...
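A quick way to capture those reference values is to print them from the Python side and paste them into the Swift test (the model id below is an assumption; substitute whichever checkpoint is under test):

```python
from transformers import AutoTokenizer

# Assumed model id for illustration.
tok = AutoTokenizer.from_pretrained("google/gemma-3-4b-it")
enc = tok("Describe this image.", return_tensors="np")
print(enc["input_ids"].tolist())       # paste into the Swift side
print(enc["attention_mask"].tolist())  # and compare against its tokenizer
```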
> Gemma 3 4B is working with images, although the output quality quickly degrades in a multi-turn conversation.
>
> When I try to load Gemma 3 12B, I get...
Here is the predicate for quantization from mlx_vlm:

```python
def get_class_predicate(skip_vision, weights=None):
    if skip_vision:
        return lambda p, m: hasattr(m, "to_quantized") and not (
            "vision_model" in p or "vision_tower" in p...
```
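For context, a predicate like this is what gets passed to `mlx.nn.quantize` as `class_predicate`, so the vision weights are left unquantized. A minimal sketch of the call site, assuming typical defaults (the group size and bit width are not from this thread):

```python
import mlx.nn as nn

def quantize_language_only(model, group_size=64, bits=4):
    """Quantize the language-model weights in place, skipping the vision
    tower via the path-based predicate above."""
    nn.quantize(
        model,
        group_size=group_size,
        bits=bits,
        class_predicate=get_class_predicate(skip_vision=True),
    )
```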
The template issue is `{{message['role'] | capitalize}}` (working) vs `{{message['role'].capitalize()}}` (not working)
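The distinction matters because `| capitalize` is a Jinja filter, while `.capitalize()` is a Python string method that only full Jinja implementations happen to support. A quick illustration in Python's Jinja2, where both forms render (unlike in the minimal Swift Jinja engine):

```python
from jinja2 import Template

msg = {"role": "user"}
# Filter form: portable across Jinja implementations.
print(Template("{{ message['role'] | capitalize }}").render(message=msg))   # -> User
# Method-call form: relies on Python string semantics; minimal Jinja
# engines (such as the Swift one) do not support it.
print(Template("{{ message['role'].capitalize() }}").render(message=msg))  # -> User
```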
No longer gets the Jinja error with current `main`, but about 90% of the time it fails to describe the image correctly. Oddly, the remaining ~10% of the time it works fine.
Which is probably #318 -- closing this in favor of that as the original issue seems to be resolved.