Compatibility Issues with Qwen Series Models (VL, QVQ-max) via LiteLLM Proxy
I've been attempting to integrate Alibaba Cloud's Qwen series models (specifically dashscope/qwen-vl-max and dashscope/qvq-max) into Bytebot using the recommended LiteLLM proxy setup (local Docker Compose). While basic connectivity was established after significant debugging (related to agent authentication and build caching), severe compatibility issues remain with these specific models, preventing their effective use.
1. Qwen-VL Models (e.g., `qwen-vl-max`) - Non-Standard Tool Calling:

- Problem: When `bytebot-agent` sends a request with `tools` defined and `tool_choice: "auto"`, `qwen-vl-max` (via LiteLLM) does not populate the standard `tool_calls` field in the response. Instead, it returns `tool_calls: null` and embeds the intended tool calls as fenced JSON code blocks of the form `{ "name": "...", "input": {...} }` directly within the `message.content` field, often mixed with natural-language "thinking" text.
- Impact: `bytebot-agent`'s current response parser (`formatChatCompletionResponse` in `proxy.service.ts`) only checks the `message.tool_calls` field. Since it is null, the agent fails to recognize or execute the tool calls and treats the entire `content` (including the JSON blocks) as plain text output to the user.
- Debugging Done:
  - Confirmed via direct `curl`/Python `requests` calls to `litellm-proxy` that the issue persists even when `bytebot-agent` is bypassed, proving it's an incompatibility between Qwen-VL's output format and the standard the agent expects (reproduction sketch below).
  - Ensured the `reasoning_effort` parameter was removed from the `bytebot-agent` source code (`proxy.service.ts`) via `--no-cache` builds, ruling it out as the cause.
  - Configured `tool_choice: "auto"` via the LiteLLM UI ("Default Parameters") for the model, which did not resolve the issue.
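For anyone who wants to reproduce this without the agent, a request along these lines shows the behaviour (written here as a Node/TypeScript sketch rather than the curl/Python calls I actually used). It assumes the proxy listens on `http://localhost:4000` with your master key, that the model is registered in LiteLLM as `qwen-vl-max`, and it uses a made-up `computer_click` tool purely for illustration:

```ts
// Minimal reproduction against litellm-proxy (Node 18+, global fetch).
// Assumed values: proxy at http://localhost:4000, a LiteLLM master key,
// the model registered as "qwen-vl-max", and a made-up "computer_click" tool.
const PROXY_URL = process.env.LITELLM_URL ?? "http://localhost:4000/v1/chat/completions";
const MASTER_KEY = process.env.LITELLM_MASTER_KEY ?? "sk-...";

async function reproduce(): Promise<void> {
  const res = await fetch(PROXY_URL, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${MASTER_KEY}`,
    },
    body: JSON.stringify({
      model: "qwen-vl-max",
      messages: [{ role: "user", content: "Open the browser." }],
      tools: [
        {
          type: "function",
          function: {
            name: "computer_click",
            description: "Click at the given screen coordinates",
            parameters: {
              type: "object",
              properties: { x: { type: "number" }, y: { type: "number" } },
              required: ["x", "y"],
            },
          },
        },
      ],
      tool_choice: "auto",
    }),
  });

  const data = await res.json();
  const message = data.choices?.[0]?.message;
  // With a compliant model, message.tool_calls is a populated array.
  // With qwen-vl-max, tool_calls is null and the JSON call sits inside content.
  console.log("tool_calls:", message?.tool_calls);
  console.log("content:", message?.content);
}

reproduce().catch(console.error);
```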
2. QVQ/QVQ-max Models - HTTPS Connection Error:

- Problem: Attempts to use `dashscope/qvq-max` consistently fail with the error `litellm.BadRequestError: DashscopeException - current user api does not support http call`.
- Debugging Done:
  - Verified multiple times that the `api_base` configured in the LiteLLM UI for this model is correctly set to HTTPS: `https://dashscope.aliyuncs.com/compatible-mode/v1`.
  - This error occurs even when the Docker Desktop global proxy and any system-level proxies (like Clash) are completely disabled, ruling out proxy HTTPS-stripping.
  - Other Dashscope models (like `qwen-vl-max`) connect successfully over HTTPS using the same LiteLLM proxy setup.
- Conclusion: This suggests a specific issue either with the Dashscope endpoint for `qvq-max` when accessed via the OpenAI compatibility layer, or with how the LiteLLM adapter handles HTTPS requests only for this specific model variant. (A direct HTTPS check that bypasses LiteLLM is sketched below.)
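One further check that could isolate this is to call the compatible-mode endpoint directly over HTTPS, with LiteLLM out of the picture entirely. A rough sketch (I haven't scripted it exactly like this): it assumes a `DASHSCOPE_API_KEY` environment variable and Node 18+ for global `fetch`, and the `stream: true` flag is only a guess, since some Dashscope reasoning models accept streaming requests only:

```ts
// Direct HTTPS call to the Dashscope OpenAI-compatible endpoint, bypassing
// LiteLLM. If this works but the same model fails through litellm-proxy, the
// problem is in the adapter; if it fails identically, it is on the Dashscope
// side. Assumes DASHSCOPE_API_KEY is set (Node 18+, global fetch).
const ENDPOINT = "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions";

async function checkQvqMax(): Promise<void> {
  const res = await fetch(ENDPOINT, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.DASHSCOPE_API_KEY}`,
    },
    body: JSON.stringify({
      model: "qvq-max",
      messages: [{ role: "user", content: "Hello" }],
      // Guess: some Dashscope reasoning models only accept streaming requests.
      stream: true,
    }),
  });

  console.log("HTTP status:", res.status);
  console.log(await res.text()); // SSE chunks or an error body
}

checkQvqMax().catch(console.error);
```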
Environment:
- Bytebot: Built locally from source (recent `edge` equivalent).
- LiteLLM: Running via Docker using the `ghcr.io/berriai/litellm:main-stable` image.
- Models Tested: `dashscope/qwen-vl-max`, `dashscope/qvq-max`.
- Setup: Local Docker Compose on Windows, using the project's provided `postgres` container for both the `bytebot-agent` and `litellm-proxy` databases (`bytebotdb` and `litellm_logs_db` respectively). LiteLLM configured with `master_key` and `encryption_key`.
Workarounds Attempted:
- Qwen-VL: Manually modified `bytebot-agent`'s `formatChatCompletionResponse` function in `proxy.service.ts` to parse fenced JSON code blocks from `message.content` when `message.tool_calls` is null (code based on our discussion can be provided if needed; a rough sketch follows this list). This works, but requires modifying the agent source.
- QVQ-max: No workaround found. The model remains unusable due to the persistent HTTPS error.
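The fallback I added looks roughly like the sketch below. This is not the exact code from my modified `proxy.service.ts`, just the shape of the idea; the content-block and return types are simplified:

```ts
// Hypothetical fallback for formatChatCompletionResponse: when the model
// returns tool_calls: null but embeds calls as fenced JSON blocks inside
// message.content, extract and parse them.
interface ExtractedToolCall {
  name: string;
  input: Record<string, unknown>;
}

function extractToolCallsFromContent(content: string): ExtractedToolCall[] {
  const calls: ExtractedToolCall[] = [];
  const fenceRegex = /```json\s*([\s\S]*?)```/g; // every fenced JSON block
  let match: RegExpExecArray | null;
  while ((match = fenceRegex.exec(content)) !== null) {
    try {
      const parsed = JSON.parse(match[1]);
      if (parsed && typeof parsed.name === "string") {
        calls.push({ name: parsed.name, input: parsed.input ?? {} });
      }
    } catch {
      // Not valid JSON (e.g. leftover "thinking" text); skip this block.
    }
  }
  return calls;
}

// Sketch of the call site:
// const toolCalls =
//   message.tool_calls ?? extractToolCallsFromContent(message.content ?? "");
```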
Suggestions:
- Enhance the `bytebot-agent` Parser: Update `formatChatCompletionResponse` to include fallback logic that parses JSON code blocks from `message.content` if `message.tool_calls` is null/empty. This would provide out-of-the-box compatibility with models like Qwen-VL.
- Investigate the QVQ HTTPS Issue: This seems like a deeper issue, potentially within LiteLLM's Dashscope adapter or the Dashscope endpoint itself. Collaboration with the LiteLLM team might be needed.
- Document Compatibility: Update the Bytebot documentation regarding known compatibility issues with specific Qwen models and potential workarounds (like the agent code modification).
- Agent Authentication: Address the underlying issue where `bytebot-agent` ignores standard proxy keys and requires hardcoding or the `OPENAI_API_KEY` workaround.
Thanks for looking into this. Qwen models are important in certain regions, and improving compatibility would be very beneficial.
This is particularly important. To the best of my knowledge, Qwen VL is also better suited for computer-use agents (especially with the latest open-source model, Qwen3 VL), so integrating it is crucial. I am using OpenRouter in my Bytebot fork to run the deployed Qwen3 VL. Btw, a quick question: do you think using LiteLLM is better?
Hi @BuesrB
Your approach using OpenRouter in a Bytebot fork sounds quite interesting, especially since you mentioned Qwen3 VL seems well-suited for this type of agent work.
I took a look at your GitHub profile hoping to find the fork you mentioned and learn more, but I wasn't able to locate it. Would you be willing to share a link to your fork if it's public?
I'm particularly curious about how you handled the Qwen VL tool-calling incompatibility we discussed.
Does OpenRouter handle this adaptation automatically for Qwen models, or did you need to implement specific adapter logic (e.g., parsing the content field) or perhaps use targeted prompt engineering within your fork to get the tool calls working reliably? Thanks!
@Uc207Pr4f57t9-251
I made a small change to support passing the model name when using LiteLLM and was able to get Bytebot to work with qwen3-vl. It can open a browser, but fails on other tool calls due to:
https://github.com/bytebot-ai/bytebot/issues/153
So it looks like there is some command mapping to do.
The fork is here: https://github.com/kira-id/cua.kira. We have improved it: it now works with Qwen3 VL, supports taking over the desktop directly from the home screen, adds Blender on the desktop, merges some genuinely useful PRs from here, and more. Could you please take a look and see if it works for you?
The adapter logic is modified accordingly, continuing PR https://github.com/bytebot-ai/bytebot/pull/145. I do believe this adapter logic could be standardized across Grok, Anthropic, OpenAI, etc., with only minimal modification per provider; I suppose that is the intention of the proxy approach. In the fork, the adapters are still kept separate for better clarity and easier development (a rough sketch of what a shared interface could look like is below).
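To make the standardization idea concrete, the shared surface could look something like this. This is only a sketch of the idea, not code from the fork, and the names are made up:

```ts
// Hypothetical shape of a provider-agnostic adapter (names are made up).
// Each provider (OpenAI, Anthropic, Grok, OpenRouter/Qwen, ...) implements
// only the two translation steps; the rest of the agent loop stays shared.
interface ToolCall {
  name: string;
  input: Record<string, unknown>;
}

interface AgentResponse {
  text: string;
  toolCalls: ToolCall[];
}

interface ProviderAdapter {
  // Translate the agent's generic request into the provider's wire format.
  buildRequest(messages: unknown[], tools: unknown[]): Record<string, unknown>;
  // Normalize the provider's response, including quirks such as Qwen-VL
  // embedding its tool calls as JSON blocks inside the content field.
  parseResponse(raw: unknown): AgentResponse;
}
```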
The main next step I see comes from the fact that no LLM I have tested so far can do CUA that well. In particular, it misclicks buttons a lot, often missing by a few pixels. Almost there, but not quite. So I suppose this is the next step in making CUA actually useful.
The command mapping is definitely the big thing to do. We have done it here: https://github.com/kira-id/cua.kira. I think I have tagged you in another issue, haha.
I was just about to leave a comment to reply, but thank you so much @samkoesnadi, and yes, I totally agree with you. In particular, the misclicks and the incorrect cursor coordinates are creating lots of errors. But yes, it is almost there. So anyone who is interested and curious is welcome to test it, try it out, and leave a comment :) https://github.com/kira-id/cua.kira
May I ask how to solve this error when I'm using it?
```
bytebot-agent | If you want to use these params dynamically send allowed_openai_params=['reasoning_effort'] in your request.. Received Model Group=openrouter/qwen/qwen3-vl-32b-instruct
bytebot-agent | Available Model Group Fallbacks=None
bytebot-agent | Error: 400 litellm.UnsupportedParamsError: openrouter does not support parameters: ['reasoning_effort'], for model=qwen/qwen3-vl-32b-instruct. To drop these, set litellm.drop_params=True or for proxy:
bytebot-agent |
bytebot-agent | litellm_settings:
bytebot-agent |    drop_params: true
bytebot-agent | .
bytebot-agent | If you want to use these params dynamically send allowed_openai_params=['reasoning_effort'] in your request.. Received Model Group=openrouter/qwen/qwen3-vl-32b-instruct
bytebot-agent | Available Model Group Fallbacks=None
bytebot-agent | at APIError.generate (/app/bytebot-agent/node_modules/openai/core/error.js:45:20)
bytebot-agent | at OpenAI.makeStatusError (/app/bytebot-agent/node_modules/openai/client.js:158:32)
bytebot-agent | at OpenAI.makeRequest (/app/bytebot-agent/node_modules/openai/client.js:301:30)
bytebot-agent | at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
bytebot-agent | at async ProxyService.generateMessage (/app/bytebot-agent/dist/proxy/proxy.service.js:44:32)
bytebot-agent | at async AgentProcessor.runIteration (/app/bytebot-agent/dist/agent/agent.processor.js:137:29)
```
Hi @liliangdao,
This is a known issue that stems from `bytebot-agent`'s source code, not your model configuration. The agent hardcodes a `reasoning_effort: 'high'` parameter in all its requests, which OpenRouter does not support, leading to the 400 `UnsupportedParamsError` you're seeing.
As you noted in the logs, the fix is to configure your litellm-proxy to silently drop this unsupported parameter before it reaches OpenRouter.
The Solution
1. Open your `litellm-proxy` configuration file, located at `packages/bytebot-llm-proxy/litellm-config.yaml`.

2. Add the `drop_params: true` line inside the `litellm_settings:` block:

   ```yaml
   litellm_settings:
     debug: true
     detailed_debug: true
     encryption_key: "your_encryption_key"
     # or other settings...
     # --- Add this line ---
     drop_params: true
     # ---------------------
   ```

3. After saving the file, rebuild your Docker containers so `litellm-proxy` picks up the new setting:

   ```bash
   # Make sure to include all your .yml files (e.g., -f docker-compose.yml)
   docker compose up -d --build
   ```

This will resolve the `reasoning_effort` error.
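As an aside: instead of dropping the parameter at the proxy, the same error could be avoided on the agent side by only sending `reasoning_effort` to models that accept it. A rough sketch of the idea, not the actual `proxy.service.ts` code (the model patterns below are placeholders):

```ts
// Hypothetical agent-side guard: only send reasoning_effort to models that
// are known to accept it. The pattern list below is illustrative, not a
// statement about which models actually support the parameter.
const REASONING_EFFORT_MODELS: RegExp[] = [/^o[134]/, /gpt-5/];

function supportsReasoningEffort(model: string): boolean {
  return REASONING_EFFORT_MODELS.some((pattern) => pattern.test(model));
}

function buildChatRequest(model: string, messages: unknown[], tools: unknown[]) {
  return {
    model,
    messages,
    tools,
    tool_choice: "auto" as const,
    // Spread the parameter in only when supported, so OpenRouter/Qwen
    // requests never carry reasoning_effort at all.
    ...(supportsReasoningEffort(model) ? { reasoning_effort: "high" as const } : {}),
  };
}
```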
A Note on This Topic
Just a heads-up: this specific `reasoning_effort` / `drop_params` bug has been discussed in detail in Issue #151 (as seen in this comment: https://github.com/bytebot-ai/bytebot/issues/151#issuecomment-3466966448).
To keep the conversation focused, it's best to discuss any further issues related to this specific parameter in that thread. This current issue is tracking the separate, more complex problem of Qwen models returning a non-standard tool call format (i.e., embedding JSON in the `content` field instead of using the `tool_calls` array).
@samkoesnadi
Thanks for sharing your fork (https://github.com/kira-id/cua.kira).
I cloned your repo and ran docker-compose up using the main docker-compose.yml file. However, when the UI starts, it seems unable to load any of the default models (the model list appears empty).
I tried to find the setup instructions for OpenRouter but couldn't locate a specific guide or .env.example for it. I did see your Pull Request #2 ("cursor openrouter"), so I assumed the required variable might be OPENROUTER_API_KEY.
I added my key to a `.env` file in the root (`OPENROUTER_API_KEY=sk-or-XXXX...`), set the other configs from `docker/.env.example`, and ran `docker compose up -d` again, but I'm still seeing the same issue (no models loading).
I feel like I must be missing a configuration step. Is OPENROUTER_API_KEY the correct environment variable, or is there another file I need to set up to get the agent to connect to OpenRouter and load the Qwen3VL models?
Docker output:
```
[Nest] 18 - 10/30/2025, 3:12:19 PM WARN [AgentAnalyticsService] BYTEBOT_ANALYTICS_ENDPOINT is not set. Analytics service disabled.
[Nest] 18 - 10/30/2025, 3:12:19 PM WARN [AnthropicService] ANTHROPIC_API_KEY is not set. AnthropicService will not work properly.
[Nest] 18 - 10/30/2025, 3:12:19 PM WARN [OpenAIService] OPENAI_API_KEY is not set. OpenAIService will not work properly.
[Nest] 18 - 10/30/2025, 3:12:19 PM WARN [GoogleService] GEMINI_API_KEY is not set. GoogleService will not work properly.
[Nest] 18 - 10/30/2025, 3:12:19 PM WARN [ProxyService] BYTEBOT_LLM_PROXY_URL is not set. ProxyService will not work properly.
```
It seems no `OPENROUTER_API_KEY` is loaded?
Hi, thanks for trying it out. You mentioned you put the `.env` in the root? Maybe that's the culprit: it should be under `docker/`, i.e. `docker/.env`. I hope this fixes it.