David Koski
I am working on it right now and have paligemma done (well, not debugged, but callable). Now I am working out how to structure the code with regard to the LLM...
OK, you can see what I have -- more work to be done but the eval loop is worked out. #151
This continues -- I have most of the refactoring done and `llm-tool` has a hard-coded call to `paligemma`. I need to implement a second VLM (`qwen2_vl`) so I can...
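Roughly the shape I have in mind for removing that hard-coded call: each VLM registers behind a shared protocol and gets resolved by its `--model` identifier. A rough sketch only -- all names here are illustrative guesses, not the actual mlx-swift-examples API:

```swift
import Foundation

/// Hypothetical protocol each VLM would adopt; not the actual API.
protocol VLMModel {
    /// Produce text for a prompt plus optional image inputs.
    func generate(prompt: String, images: [URL]) throws -> String
}

/// Hypothetical registry so llm-tool can resolve any registered model
/// by the identifier passed on the command line.
struct VLMRegistry {
    private var factories: [String: () -> any VLMModel] = [:]

    mutating func register(_ id: String, _ factory: @escaping () -> any VLMModel) {
        factories[id] = factory
    }

    func model(for id: String) -> (any VLMModel)? {
        factories[id]?()
    }
}
```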
It is implemented in the branch right now but still lacks the image processor -- that is what I am starting on next.
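By "image processor" I mean the resize/normalize step that turns pixels into model input. A minimal sketch of the normalization part, assuming the pixels are already decoded into an `MLXArray` of shape `[H, W, 3]` with values in 0...255; the mean/std constants below are the common CLIP-style values, which may not match what paligemma or qwen2_vl actually use:

```swift
import MLX

/// Sketch only: scale to [0, 1], then standardize per channel.
/// The constants are the common CLIP-style values, an assumption here.
func normalize(_ pixels: MLXArray) -> MLXArray {
    let mean = MLXArray([0.48145466, 0.4578275, 0.40821073] as [Float])
    let std = MLXArray([0.26862954, 0.26130258, 0.27577711] as [Float])
    let scaled = pixels / 255
    // [H, W, 3] broadcasts against [3], normalizing each channel
    return (scaled - mean) / std
}
```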
Yes, this first version won't have it, but it should be straightforward to add. Qwen2VL treats an array of images and a video roughly the same but handles them slightly...
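A minimal sketch of that idea: both an image array and a video reduce to an ordered frame sequence, with video carrying extra temporal metadata. The types below are hypothetical stand-ins, not the actual Qwen2VL processor:

```swift
import CoreGraphics

/// Hypothetical input type: images and video normalize to the same
/// frame sequence, but video keeps its frame rate for temporal handling.
enum MediaInput {
    case images([CGImage])
    case video(frames: [CGImage], fps: Double)

    /// The frames the model ultimately consumes in either case.
    var frames: [CGImage] {
        switch self {
        case .images(let images):
            return images
        case .video(let frames, _):
            return frames
        }
    }
}
```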
> btw, @davidkoski is there a way to set up an LLM API on MLX as is done with llama.cpp or tools like LM Studio? I have done this with...
> Thanks for your work on the vlm branch @davidkoski. Using llm-tool I can get paligemma to work with the following flag: `--model mlx-community/paligemma-3b-mix-448-8bit`
>
> but I can't...
> Here are the flags I'm using: `vlm --model mlx-community/Qwen2-VL-2B-Instruct-4bit --prompt "describe image in english" --image /Users/pathtoimage/image.png`
>
> I get this output when trying to use that model:
>
> ...
Closing this -- we have two models (qwen2-vl and paligemma). More can be added over time.
https://github.com/ml-explore/mlx/blob/eaf709b83e559079e212699bfc9dd2f939d25c9a/python/mlx/optimizers/optimizers.py#L157

Sure, it just needs porting.
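A loose sketch of what such a port looks like, using plain SGD with momentum as a stand-in (the optimizer at the linked line may be a different one). Only the Python update rule `v = momentum * v + grad; p = p - lr * v` is translated into `MLXArray` operations; the surrounding optimizer protocol is omitted:

```swift
import MLX

/// Sketch only: one SGD-with-momentum step on a single parameter,
/// mirroring the structure of the Python mlx optimizers.
func sgdUpdate(
    parameter: MLXArray,
    gradient: MLXArray,
    velocity: inout MLXArray,
    learningRate: Float = 1e-2,
    momentum: Float = 0.9
) -> MLXArray {
    // accumulate the momentum buffer: v = momentum * v + grad
    velocity = velocity * momentum + gradient
    // apply the step: p = p - lr * v
    return parameter - velocity * learningRate
}
```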