UI-TARS-1.5-7B element locator accuracy significantly degraded
When using the previous version of the model, UI-TARS-7B, for Web UI automation, element location was relatively accurate and stable. However, after upgrading to UI-TARS-1.5-7B, the accuracy of element positioning has significantly decreased. As a result, many of our existing automation test cases are no longer executable or fail frequently.
We have updated the tutorial on coordinate processing.
Thanks for your reply, I’ll give it a try later.
@NEOOOOOOOOOO , do you know how to deploy UI-TARS-1.5-7B locally using vLLM? there is no documentation about local deployment. thanks
@XingWang1234
Install vllm via pip install vllm and start the model with Python:
CUDA_VISIBLE_DEVICES=0,1 nohup python -m vllm.entrypoints.openai.api_server \
--served-model-name ui-tars \
--model bytedance-research/UI-TARS-1.5-7B \
--port 10006 \
--tensor-parallel-size 2 \
--gpu-memory-utilization 0.85 \
--limit-mm-per-prompt "image=5" \
--max-model-len 20000 \
--disable-custom-all-reduce > ui-tars.log 2>&1 &
@NEOOOOOOOOOO thank you so much.