Add Support for OS-Copilot/OS-Atlas-Base-7B
@Blaizzy Can you help me run OS-Copilot/OS-Atlas-Base-7B? I tried converting to MLX (8-bit) but am unable to get the same accuracy as the original model here: https://huggingface.co/spaces/maxiw/OS-ATLAS
What could I be doing wrong?
Command used: `mlx_vlm.convert --hf-path OS-Copilot/OS-Atlas-Base-7B -q --q-bits 8`
More details here: https://github.com/OS-Copilot/OS-Atlas/issues/51
Hey,
I just tried it.
It works well on the demo samples but fails on custom UIs.
Check the screen resolution they are using and the prompting strategy.
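For reference, this is roughly the resize logic the Qwen2-VL processor applies (a sketch from memory; the defaults here are assumptions, so check the actual processor config):

```python
import math

# Sketch of Qwen2-VL-style "smart" resizing: dimensions are snapped to
# multiples of `factor` (patch size x spatial merge size) and the total
# pixel count is kept within [min_pixels, max_pixels].
def smart_resize(height, width, factor=28,
                 min_pixels=56 * 56, max_pixels=14 * 14 * 4 * 1280):
    h_bar = round(height / factor) * factor
    w_bar = round(width / factor) * factor
    if h_bar * w_bar > max_pixels:
        # Too large: shrink, rounding down to the nearest factor multiple
        beta = math.sqrt((height * width) / max_pixels)
        h_bar = math.floor(height / beta / factor) * factor
        w_bar = math.floor(width / beta / factor) * factor
    elif h_bar * w_bar < min_pixels:
        # Too small: enlarge, rounding up to the nearest factor multiple
        beta = math.sqrt(min_pixels / (height * width))
        h_bar = math.ceil(height * beta / factor) * factor
        w_bar = math.ceil(width * beta / factor) * factor
    return h_bar, w_bar
```

If the MLX pipeline and the HF Space resize to different dimensions, the predicted coordinates will disagree even with identical weights.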
I checked that; they're resizing, but that still doesn't help. @Blaizzy any guidance you could provide?
It also fails on this sample (attached here): https://maxiw-os-atlas.hf.space/gradio_api/file=/tmp/gradio/fc0a8ef05b952a970924913eea2b89a8d626c92031f94ff5f3e6ccaf8dd23a4e/web_6f93090a-81f6-489e-bb35-1a2838b18c01.png. The y coordinate specifically is off.
Qwen2-VL needs its bounding boxes normalised to a 0-1000 scale.
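i.e. the model's output coordinates are on a 0-1000 scale and have to be mapped back to the image's pixel space. A minimal sketch (the helper name is mine, not from either codebase):

```python
# Map a bbox predicted on a 0-1000 normalised scale back to pixel
# coordinates of the image the model actually saw.
def denormalize_bbox(bbox, img_width, img_height, scale=1000):
    x1, y1, x2, y2 = bbox
    return (x1 * img_width / scale, y1 * img_height / scale,
            x2 * img_width / scale, y2 * img_height / scale)

# e.g. a box at (250, 500, 750, 900) on a 1920x1080 screenshot
print(denormalize_bbox((250, 500, 750, 900), 1920, 1080))
```

Note the y axis is scaled by the image height: if the model saw a resized image, the mapping has to use the resized height, which is one way the y coordinate alone can drift while x stays correct.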
I've been doing that, @Blaizzy. I get the x axis correct; only the y axis is wrong.
Also, when you say you've tried the demo samples here (https://github.com/Blaizzy/mlx-vlm/issues/229#issuecomment-2706806607), did you mean you used the original Hugging Face model, or one converted via MLX?
This is from the Hugging Face Spaces logs:
vs
This is using MLX:
Could you take a look, by any chance?
BTW, I kept the original size of the image on Hugging Face Spaces; I modified the code to keep the original image, and you can see it in the logs.
@Blaizzy any directions you could point me in?
The same happens for https://huggingface.co/osunlp/UGround-V1-7B too; maybe I'm missing something?
@Blaizzy any pointers?
@Blaizzy I've been stuck on this forever; any direction you could help me with?
Hey Prince,
My plate is full.
I have given you all the pointers you need. This is a great opportunity for you to learn.
From the images you've shown above, the bounding boxes between HF and MLX have a high IoU (Intersection over Union) of more than 60%, which is good.
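For anyone following along, the IoU between two axis-aligned boxes can be computed like this (a standard sketch, not code from either repo):

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    # Intersection rectangle (clamped to zero if the boxes don't overlap)
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    # Union = sum of areas minus the overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

An IoU of 1.0 means the two backends drew identical boxes; 0.0 means no overlap at all.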
I'm closing this issue for now.
Please feel free to re-open if and when you find a reproducible bug/problem with MLX.
Just one thing: is this level of IoU expected between HF and MLX? @Blaizzy
BTW @Blaizzy, I believe we're not calculating the RoPE indices in MLX-VLM (see https://github.com/huggingface/transformers/blob/87b30c35892568f9b83d4e8d1233956b8e0cd96c/src/transformers/models/qwen2_vl/modeling_qwen2_vl.py#L1708), which is causing the problem. Once I comment out this section in transformers, I get the same issue there as well.
@Blaizzy can you re-open this issue? There seems to be an issue, as stated above. I'm working on a patch in the meantime.
@Blaizzy also, can you review whether the issue described above is correct?
@Blaizzy can you review this PR: https://github.com/Blaizzy/mlx-vlm/pull/319? This has been handled now.