Add Support for OS-Copilot/OS-Atlas-Base-7B

Open prncvrm opened this issue 11 months ago • 19 comments

@Blaizzy Can you help me run OS-Copilot/OS-Atlas-Base-7B? I tried converting it to MLX with 8-bit quantization, but I'm unable to get the same accuracy as the original model here: https://huggingface.co/spaces/maxiw/OS-ATLAS. What could I be doing wrong? The command I used: mlx_vlm.convert --hf-path OS-Copilot/OS-Atlas-Base-7B -q --q-bits 8

prncvrm avatar Mar 06 '25 20:03 prncvrm

more details here https://github.com/OS-Copilot/OS-Atlas/issues/51

prncvrm avatar Mar 07 '25 14:03 prncvrm

Hey,

I just tried it.

It works well on demo samples but fails with custom UIs

Check the screen resolution they are using and the prompting strategy

Blaizzy avatar Mar 07 '25 15:03 Blaizzy

I checked that. They're resizing, but that still doesn't help me. @Blaizzy, any guidance you could provide?

Also, it fails on the same image attached here: https://maxiw-os-atlas.hf.space/gradio_api/file=/tmp/gradio/fc0a8ef05b952a970924913eea2b89a8d626c92031f94ff5f3e6ccaf8dd23a4e/web_6f93090a-81f6-489e-bb35-1a2838b18c01.png. It's specifically the y coordinate that is off.

prncvrm avatar Mar 08 '25 06:03 prncvrm

Qwen2-VL needs its bbox normalised to 1000
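For readers following along, here is a minimal sketch of what normalising to 1000 means in practice. It assumes the model emits boxes as (x1, y1, x2, y2) on a 0 to 1000 scale relative to the image, as the comment above suggests; the function names are hypothetical and are not part of mlx-vlm or transformers.

```python
def bbox_1000_to_pixels(bbox, image_width, image_height):
    """Map an (x1, y1, x2, y2) box on a 0-1000 scale to pixel coordinates.

    Assumption: the 0-1000 convention described in the thread.
    Each axis is scaled by its own image dimension.
    """
    x1, y1, x2, y2 = bbox
    return (
        x1 * image_width / 1000,
        y1 * image_height / 1000,
        x2 * image_width / 1000,
        y2 * image_height / 1000,
    )


def pixels_to_bbox_1000(bbox, image_width, image_height):
    """Inverse mapping: pixel coordinates back to the 0-1000 scale."""
    x1, y1, x2, y2 = bbox
    return (
        round(x1 * 1000 / image_width),
        round(y1 * 1000 / image_height),
        round(x2 * 1000 / image_width),
        round(y2 * 1000 / image_height),
    )


# Example: a box on a 1920x1080 screenshot.
print(bbox_1000_to_pixels((500, 250, 1000, 500), 1920, 1080))
```

The width and height passed in should be those of the image the coordinates are meant to be drawn on, which matters if the image was resized before inference.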

Blaizzy avatar Mar 08 '25 06:03 Blaizzy

I've been doing that, @Blaizzy. I do get the x axis correct; just the y axis is wrong.

prncvrm avatar Mar 08 '25 15:03 prncvrm

Also, when you say you've tried it on the demo samples here https://github.com/Blaizzy/mlx-vlm/issues/229#issuecomment-2706806607, did you mean you used the original Hugging Face model, or one converted via MLX?

prncvrm avatar Mar 08 '25 17:03 prncvrm

[Image] This is from the Hugging Face Spaces logs, vs.

[Image] This is using MLX.

Can you see if you can help, by any chance?

prncvrm avatar Mar 08 '25 17:03 prncvrm

BTW, I've kept the original image size on Hugging Face Spaces. I modified the code to keep the original image; you can see it in the logs.

prncvrm avatar Mar 08 '25 17:03 prncvrm

@Blaizzy any directions you could help me with?

prncvrm avatar Mar 10 '25 08:03 prncvrm

The same happens for https://huggingface.co/osunlp/UGround-V1-7B too. Maybe I'm missing something?

prncvrm avatar Mar 10 '25 10:03 prncvrm

@Blaizzy any pointers?

prncvrm avatar Mar 11 '25 08:03 prncvrm

@Blaizzy I've been stuck on this forever. Any direction you could help me with?

prncvrm avatar Mar 12 '25 10:03 prncvrm

Hey Prince

I have my plate full,

I have given you all the pointers you need. This is a great opportunity for you to learn.

From the images you showed above, the bounding boxes from HF and MLX have a high IoU (Intersection over Union) of more than 60%, which is good.
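For reference, IoU between two axis-aligned boxes can be computed as in the sketch below. It assumes both boxes are (x1, y1, x2, y2) tuples in the same coordinate system; the sample boxes are made up for illustration and are not the ones from the screenshots.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Overlap along each axis, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Illustrative boxes: identical x range, y shifted by half the box height.
print(iou((0, 0, 10, 10), (0, 5, 10, 15)))
```

A shift along a single axis lowers the IoU quickly: in the illustrative case the two boxes share half their area but the IoU is only about one third.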

Blaizzy avatar Mar 12 '25 11:03 Blaizzy

I'm closing this issue for now.

Please feel free to re-open if and when you find a bug/problem with MLX that is reproducible.

Blaizzy avatar Mar 12 '25 11:03 Blaizzy

Just one thing: is this IoU expected between HF and MLX? @Blaizzy

prncvrm avatar Mar 13 '25 05:03 prncvrm

https://github.com/huggingface/transformers/blob/87b30c35892568f9b83d4e8d1233956b8e0cd96c/src/transformers/models/qwen2_vl/modeling_qwen2_vl.py#L1708 I believe we're not calculating the RoPE indices in MLX-VLM, which is causing the problem. Once I comment out this section in transformers, I get the same issue there as well, BTW @Blaizzy
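To illustrate what the RoPE index calculation contributes, here is a simplified sketch of multimodal position indices for a sequence of text and single-image segments. It is an assumption-laden illustration of the general scheme (text tokens share one index across the temporal, height, and width axes; image tokens get per-row and per-column indices), not the transformers or mlx-vlm code, and `mrope_position_ids` is a hypothetical name.

```python
def mrope_position_ids(segments):
    """Build (temporal, height, width) position ids for a mixed sequence.

    segments: list of ("text", n_tokens) or ("image", (grid_h, grid_w)),
    where the grid is the image token grid seen by the language model.
    Simplified: single-frame images only, no batching or padding.
    """
    t_ids, h_ids, w_ids = [], [], []
    next_pos = 0
    for kind, spec in segments:
        if kind == "text":
            # Text tokens use the same index on all three axes.
            for i in range(spec):
                pos = next_pos + i
                t_ids.append(pos)
                h_ids.append(pos)
                w_ids.append(pos)
            next_pos += spec
        elif kind == "image":
            grid_h, grid_w = spec
            # Image tokens: constant temporal index, row index on the
            # height axis, column index on the width axis.
            for row in range(grid_h):
                for col in range(grid_w):
                    t_ids.append(next_pos)
                    h_ids.append(next_pos + row)
                    w_ids.append(next_pos + col)
            # Following tokens continue after the largest index used.
            next_pos += max(grid_h, grid_w)
        else:
            raise ValueError(f"unknown segment kind: {kind}")
    return t_ids, h_ids, w_ids


t, h, w = mrope_position_ids([("text", 2), ("image", (2, 3)), ("text", 2)])
print(t)
print(h)
print(w)
```

If plain sequential positions are used instead, the image tokens lose their explicit row and column indices, which would be consistent with the localisation errors reported in this thread.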

prncvrm avatar Mar 13 '25 11:03 prncvrm

@Blaizzy can you re-open this issue? There seems to be an issue, as stated above. I'm trying to work on the patch in the meantime.

prncvrm avatar Mar 17 '25 11:03 prncvrm

@Blaizzy also, can you check whether the issue as described is correct?

prncvrm avatar Mar 18 '25 06:03 prncvrm

@Blaizzy can you review this PR? https://github.com/Blaizzy/mlx-vlm/pull/319 This has been handled now.

prncvrm avatar Apr 21 '25 14:04 prncvrm