UI-TARS-desktop icon indicating copy to clipboard operation
UI-TARS-desktop copied to clipboard

[Bug]: At most 1 image(s) may be provided in one request; 400 Bad Request

Open Chenky424 opened this issue 8 months ago • 3 comments

Version

Agent-TARS-v1.0.0-alpha.7

Model

UI-TARS-7B-DPO

Deployment Method

Cloud

Issue Description

it just can carry out one-step demand, like 'open Chrome'. After finishing this, it always screenshotted again and then reported an error. Sometimes, it could not find the right location.

Error Logs

ERROR 04-09 17:23:46 [serving_chat.py:198] Error in preprocessing prompt inputs ERROR 04-09 17:23:46 [serving_chat.py:198] Traceback (most recent call last): ERROR 04-09 17:23:46 [serving_chat.py:198] File "/mnt/speech/ruiyu/miniconda3/envs/tars/lib/python3.10/site-packages/vllm/entrypoints/openai/serving_chat.py", line 181, in create_chat_completion ERROR 04-09 17:23:46 [serving_chat.py:198] ) = await self._preprocess_chat( ERROR 04-09 17:23:46 [serving_chat.py:198] File "/mnt/speech/ruiyu/miniconda3/envs/tars/lib/python3.10/site-packages/vllm/entrypoints/openai/serving_engine.py", line 391, in _preprocess_chat ERROR 04-09 17:23:46 [serving_chat.py:198] conversation, mm_data_future = parse_chat_messages_futures( ERROR 04-09 17:23:46 [serving_chat.py:198] File "/mnt/speech/ruiyu/miniconda3/envs/tars/lib/python3.10/site-packages/vllm/entrypoints/chat_utils.py", line 1139, in parse_chat_messages_futures ERROR 04-09 17:23:46 [serving_chat.py:198] sub_messages = _parse_chat_message_content( ERROR 04-09 17:23:46 [serving_chat.py:198] File "/mnt/speech/ruiyu/miniconda3/envs/tars/lib/python3.10/site-packages/vllm/entrypoints/chat_utils.py", line 1067, in _parse_chat_message_content ERROR 04-09 17:23:46 [serving_chat.py:198] result = _parse_chat_message_content_parts( ERROR 04-09 17:23:46 [serving_chat.py:198] File "/mnt/speech/ruiyu/miniconda3/envs/tars/lib/python3.10/site-packages/vllm/entrypoints/chat_utils.py", line 967, in _parse_chat_message_content_parts ERROR 04-09 17:23:46 [serving_chat.py:198] parse_res = _parse_chat_message_content_part( ERROR 04-09 17:23:46 [serving_chat.py:198] File "/mnt/speech/ruiyu/miniconda3/envs/tars/lib/python3.10/site-packages/vllm/entrypoints/chat_utils.py", line 1024, in _parse_chat_message_content_part ERROR 04-09 17:23:46 [serving_chat.py:198] mm_parser.parse_image(str_content) ERROR 04-09 17:23:46 [serving_chat.py:198] File "/mnt/speech/ruiyu/miniconda3/envs/tars/lib/python3.10/site-packages/vllm/entrypoints/chat_utils.py", line 725, in parse_image ERROR 04-09 17:23:46 [serving_chat.py:198] placeholder = self._tracker.add("image", image_coro) ERROR 04-09 17:23:46 [serving_chat.py:198] File "/mnt/speech/ruiyu/miniconda3/envs/tars/lib/python3.10/site-packages/vllm/entrypoints/chat_utils.py", line 548, in add ERROR 04-09 17:23:46 [serving_chat.py:198] raise ValueError( ERROR 04-09 17:23:46 [serving_chat.py:198] ValueError: At most 1 image(s) may be provided in one request. INFO: 127.0.0.1:32910 - "POST /v1/chat/completions HTTP/1.1" 400 Bad Request

Chenky424 avatar Apr 09 '25 09:04 Chenky424

+1 same issue

yytdfc avatar Apr 10 '25 09:04 yytdfc

Thank you for your detailed feedback and analysis.

The UI-TARS-7B-DPO model is currently designed to work only with UI-TARS Desktop and supports GUI Agent-related capabilities. It does not yet support integration with Agent-TARS, which is likely causing the issues you observed.

Some clarifications

These are two applications:

  • UI TARS Desktop is our first GUI Agent application focused on controlling computers, with its latest version being v0.0.8, which supports both Mac and Windows.

  • Agent TARS App is our new application focused on browser agents, with its latest version being Agent-TARS-v1.0.0-alpha.7. Since it is still in the technical preview stage, it currently supports only Mac.

Suggestions

  1. Use a model compatible with Agent-TARS: https://github.com/bytedance/UI-TARS-desktop/discussions/377

  2. Future Development: Long-term, we plan to make UI-TARS models, integrate with Agent-TARS. This is under technical research, so stay tuned for updates.

ulivz avatar Apr 10 '25 16:04 ulivz

UI-TARS-1.5-7B same issue

zhurunhua avatar May 30 '25 03:05 zhurunhua