UI-TARS-desktop [Bug]: At most 1 image(s) may be provided in one request; 400 Bad Request

Version

Agent-TARS-v1.0.0-alpha.7

Model

UI-TARS-7B-DPO

Deployment Method

Cloud

Issue Description

it just can carry out one-step demand, like 'open Chrome'. After finishing this, it always screenshotted again and then reported an error. Sometimes, it could not find the right location.

Error Logs

ERROR 04-09 17:23:46 [serving_chat.py:198] Error in preprocessing prompt inputs ERROR 04-09 17:23:46 [serving_chat.py:198] Traceback (most recent call last): ERROR 04-09 17:23:46 [serving_chat.py:198] File "/mnt/speech/ruiyu/miniconda3/envs/tars/lib/python3.10/site-packages/vllm/entrypoints/openai/serving_chat.py", line 181, in create_chat_completion ERROR 04-09 17:23:46 [serving_chat.py:198] ) = await self._preprocess_chat( ERROR 04-09 17:23:46 [serving_chat.py:198] File "/mnt/speech/ruiyu/miniconda3/envs/tars/lib/python3.10/site-packages/vllm/entrypoints/openai/serving_engine.py", line 391, in _preprocess_chat ERROR 04-09 17:23:46 [serving_chat.py:198] conversation, mm_data_future = parse_chat_messages_futures( ERROR 04-09 17:23:46 [serving_chat.py:198] File "/mnt/speech/ruiyu/miniconda3/envs/tars/lib/python3.10/site-packages/vllm/entrypoints/chat_utils.py", line 1139, in parse_chat_messages_futures ERROR 04-09 17:23:46 [serving_chat.py:198] sub_messages = _parse_chat_message_content( ERROR 04-09 17:23:46 [serving_chat.py:198] File "/mnt/speech/ruiyu/miniconda3/envs/tars/lib/python3.10/site-packages/vllm/entrypoints/chat_utils.py", line 1067, in _parse_chat_message_content ERROR 04-09 17:23:46 [serving_chat.py:198] result = _parse_chat_message_content_parts( ERROR 04-09 17:23:46 [serving_chat.py:198] File "/mnt/speech/ruiyu/miniconda3/envs/tars/lib/python3.10/site-packages/vllm/entrypoints/chat_utils.py", line 967, in _parse_chat_message_content_parts ERROR 04-09 17:23:46 [serving_chat.py:198] parse_res = _parse_chat_message_content_part( ERROR 04-09 17:23:46 [serving_chat.py:198] File "/mnt/speech/ruiyu/miniconda3/envs/tars/lib/python3.10/site-packages/vllm/entrypoints/chat_utils.py", line 1024, in _parse_chat_message_content_part ERROR 04-09 17:23:46 [serving_chat.py:198] mm_parser.parse_image(str_content) ERROR 04-09 17:23:46 [serving_chat.py:198] File "/mnt/speech/ruiyu/miniconda3/envs/tars/lib/python3.10/site-packages/vllm/entrypoints/chat_utils.py", line 725, in parse_image ERROR 04-09 17:23:46 [serving_chat.py:198] placeholder = self._tracker.add("image", image_coro) ERROR 04-09 17:23:46 [serving_chat.py:198] File "/mnt/speech/ruiyu/miniconda3/envs/tars/lib/python3.10/site-packages/vllm/entrypoints/chat_utils.py", line 548, in add ERROR 04-09 17:23:46 [serving_chat.py:198] raise ValueError( ERROR 04-09 17:23:46 [serving_chat.py:198] ValueError: At most 1 image(s) may be provided in one request. INFO: 127.0.0.1:32910 - "POST /v1/chat/completions HTTP/1.1" 400 Bad Request

Apr 09 '25 09:04 Chenky424

+1 same issue

Apr 10 '25 09:04 yytdfc

Thank you for your detailed feedback and analysis.

The UI-TARS-7B-DPO model is currently designed to work only with UI-TARS Desktop and supports GUI Agent-related capabilities. It does not yet support integration with Agent-TARS, which is likely causing the issues you observed.

Some clarifications

These are two applications:

UI TARS Desktop is our first GUI Agent application focused on controlling computers, with its latest version being v0.0.8, which supports both Mac and Windows.
Agent TARS App is our new application focused on browser agents, with its latest version being Agent-TARS-v1.0.0-alpha.7. Since it is still in the technical preview stage, it currently supports only Mac.

Suggestions

Use a model compatible with Agent-TARS: https://github.com/bytedance/UI-TARS-desktop/discussions/377
Future Development: Long-term, we plan to make UI-TARS models, integrate with Agent-TARS. This is under technical research, so stay tuned for updates.

Apr 10 '25 16:04 ulivz

UI-TARS-1.5-7B same issue

May 30 '25 03:05 zhurunhua

UI-TARS-desktop UI-TARS-desktop copied to clipboard

[Bug]: At most 1 image(s) may be provided in one request; 400 Bad Request

Version

Model

Deployment Method

Issue Description

Error Logs

Some clarifications

Suggestions

UI-TARS-desktop
UI-TARS-desktop copied to clipboard