Response terminates prematurely when using Gemini 3 via LiteLLM
Description
When using gemini-3-flash-preview through a LiteLLM proxy, OpenCode stops processing the response as soon as the model triggers a tool call. The model provides a reasoning block and a tool call, but OpenCode does not seem to execute the requested tool (e.g., read) and the interaction hangs or terminates without output.
The model response from LiteLLM looks like this (shortened for clarity):
{
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_023b94f976e14e3a8711dd9c9864",
            "type": "function",
            "function": {
              "name": "read",
              "arguments": "{\"filePath\": \"/path/to/file/test.txt\"}"
            },
            "provider_specific_fields": {
              "thought_signature": "..."
            }
          }
        ],
        "reasoning_content": "**Synthesizing Knowledge Bases**\n\nI've been analyzing..."
      },
      "finish_reason": "stop"
    }
  ]
}
The issue does NOT occur when connecting Gemini 3 directly to OpenCode (without LiteLLM).
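One detail in the response above is that `finish_reason` is `"stop"` even though `tool_calls` is populated. Whether this is what makes OpenCode stop is not confirmed in this report. The sketch below only shows how a client could detect a pending tool call from the message itself instead of relying on `finish_reason`. The function name `pending_tool_calls` is made up for illustration.

```python
import json


def pending_tool_calls(response: dict) -> list[dict]:
    """Return the parsed tool calls of the first choice, ignoring finish_reason."""
    message = response["choices"][0]["message"]
    parsed = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        parsed.append({
            "id": call["id"],
            "name": fn["name"],
            # In the OpenAI-style format, arguments arrive as a JSON-encoded string.
            "arguments": json.loads(fn["arguments"]),
        })
    return parsed


# Trimmed copy of the response shown in this report.
sample = {
    "choices": [{
        "index": 0,
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_023b94f976e14e3a8711dd9c9864",
                "type": "function",
                "function": {
                    "name": "read",
                    "arguments": '{"filePath": "/path/to/file/test.txt"}',
                },
                "provider_specific_fields": {"thought_signature": "..."},
            }],
        },
        "finish_reason": "stop",
    }]
}

calls = pending_tool_calls(sample)
print(sample["choices"][0]["finish_reason"], calls)
```

A client that checks only `finish_reason == "tool_calls"` would skip this message, while the check above still finds the `read` call.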
OpenCode version
1.0.203
LiteLLM Version
1.80.11
Steps to reproduce
- Set up LiteLLM with a Gemini 3 model. For example, here is my LiteLLM config:
model_list:
  - model_name: gemini/gemini-3-flash-preview
    litellm_params:
      model: gemini/gemini-3-flash-preview
      api_key: xxx
      drop_params: true
- Configure OpenCode to use the LiteLLM endpoint:
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "litellm": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "litellm",
      "options": {
        "baseURL": "https://localhost:4000/v1",
        "apiKey": "sk-xxx"
      },
      "models": {
        "gemini/gemini-3-flash-preview": {
          "name": "gemini/gemini-3-flash-preview",
          "options": {
            "reasoningEffort": "high"
          }
        }
      }
    }
  }
}
- Ask a question that requires reading a file or using a tool.
- Observe that the process stops and no file is read, even though the model requests it in its JSON response.
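To check what the proxy returns independently of OpenCode, the same kind of request can be sent straight to the LiteLLM endpoint. This is a minimal sketch using only the standard library. The base URL, key, and model name are the placeholders from the configs above. The `read` tool schema is a hypothetical stand-in, not OpenCode's actual tool definition, and the function names are illustrative.

```python
import json
import urllib.request

BASE_URL = "https://localhost:4000/v1"  # baseURL from the OpenCode config above
API_KEY = "sk-xxx"  # placeholder key from the config above
MODEL = "gemini/gemini-3-flash-preview"


def build_payload(prompt: str) -> dict:
    """Build an OpenAI-style chat completion request that offers one tool."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "read",
                # Hypothetical schema, only for reproducing a tool call.
                "description": "Read a file from disk.",
                "parameters": {
                    "type": "object",
                    "properties": {"filePath": {"type": "string"}},
                    "required": ["filePath"],
                },
            },
        }],
    }


def send(payload: dict) -> dict:
    """POST the payload to the proxy's /chat/completions route and decode the JSON reply."""
    request = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def main() -> None:
    """Send one prompt and print the fields relevant to this report."""
    body = send(build_payload("Read /path/to/file/test.txt and summarize it."))
    choice = body["choices"][0]
    print("finish_reason:", choice["finish_reason"])
    print("tool_calls:", choice["message"].get("tool_calls"))
```

Calling `main()` against a running proxy prints the `finish_reason` and `tool_calls` that OpenCode would receive for a non-streaming request.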
Screenshot and/or share link
No response
Operating System
Windows 11 with WSL 2
Terminal
No response
This issue might be a duplicate of existing issues. Please check:
- #3365: Opencode with litellm just stops before finishing the task (closed - similar symptom with LiteLLM proxy causing OpenCode to stop mid-task)
- #4832: [BUG]: Gemini 3 Pro function calling fails - missing thoughtSignature support (Gemini 3 tool calling compatibility issue)
- #3596: SSE Stream Bug: Out-of-Order thinking_delta via LiteLLM → AWS Bedrock (LiteLLM proxy causing response handling issues)
- #2915: LiteLLM error: Anthropic doesn't support tool calling without tools= param specified (LiteLLM proxy integration issues)
Feel free to ignore if none of these address your specific case.
Do not use the @ai-sdk/openai-compatible provider with Gemini on LiteLLM. Use @ai-sdk/google instead.
@emerzon Thanks for the suggestion, but using @ai-sdk/google would bypass LiteLLM entirely and connect directly to Gemini.
The whole point here is to use LiteLLM as a proxy. LiteLLM exposes an OpenAI-compatible API regardless of the backend model, so @ai-sdk/openai-compatible is the correct choice.
The issue seems to be with how OpenCode handles Gemini 3's special response fields (reasoning_content, thought_signature) when they are passed through LiteLLM. This configuration works fine with Claude Code and Continue.dev, so it appears to be an OpenCode-specific issue.
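If the stall is related to those extra fields (an assumption; the thread does not confirm the root cause), one thing a client can do is echo the assistant message back unchanged on the next turn, so that `provider_specific_fields.thought_signature` is not stripped. Whether LiteLLM reads the signature back from that field is also an assumption here. `append_tool_turn` and its parameters are hypothetical names.

```python
import copy


def append_tool_turn(messages: list[dict], assistant_message: dict,
                     tool_outputs: dict[str, str]) -> list[dict]:
    """Return a new history with the assistant tool-call turn and its results appended."""
    history = list(messages)
    # Copy the assistant message as-is so that extra fields such as
    # provider_specific_fields.thought_signature survive the round trip.
    history.append(copy.deepcopy(assistant_message))
    for call in assistant_message.get("tool_calls") or []:
        history.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": tool_outputs[call["id"]],
        })
    return history


assistant = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_023b94f976e14e3a8711dd9c9864",
        "type": "function",
        "function": {"name": "read", "arguments": '{"filePath": "/path/to/file/test.txt"}'},
        "provider_specific_fields": {"thought_signature": "..."},
    }],
}
history = append_tool_turn(
    [{"role": "user", "content": "What is in test.txt?"}],
    assistant,
    {"call_023b94f976e14e3a8711dd9c9864": "file contents here"},
)
```

The design choice is simply to avoid rebuilding the assistant message from a fixed set of known keys, since that is where unknown fields tend to get lost.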
@themw123 This is not true. You can still set the base URL and the traffic will go through LiteLLM, but it will use the Gemini request format without translation to the OpenAI format. I am using it exactly like that.
"provider": {
  "litellm-google": {
    "npm": "@ai-sdk/google",
    "name": "LiteLLM Google",
    "options": {
      "baseURL": "https://litellm.instance"
    },
    "models": {
      "gemini-3-pro-high": {
        "id": "gemini-3-pro-preview",
        "name": "Gemini 3 Pro Preview (High Thinking)",
        "options": {
          "thinkingConfig": {
            "thinkingLevel": "high",
            "includeThoughts": true
          }
        }
      },
I tried to use the passthrough config from emerzon, but it did not work.
Connecting to the LiteLLM proxy with the OpenAI-compatible provider works, but it stops constantly and is not usable.
I have been using this config without any issues. Which issues did you have?
My tests indicate that the Gemini API is not supported by the passthrough API, so I tried using Vertex instead: https://docs.litellm.ai/docs/pass_through/vertex_ai
I configured LiteLLM like this:
config.yaml
model_list:
  # Vertex AI
  - model_name: vertex_ai/*
    litellm_params:
      model: vertex_ai/*
      vertex_project: "xxxxxxxxxxxxxxxxxx"
      vertex_location: "global"
      vertex_credentials: os.environ/GOOGLE_APPLICATION_CREDENTIALS
      use_in_pass_through: true
router_settings:
  routing_strategy: simple-shuffle
  num_retries: 2
  timeout: 300
  retry_after: 10
  optional_pre_call_checks: ["responses_api_deployment_check"]
general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
  database_url: os.environ/DATABASE_URL
  store_model_in_db: true
  auto_update_model_cost_map: true
  store_prompts_in_spend_logs: true
  use_x_forwarded_for: true
litellm_settings:
  max_budget: 100
  budget_duration: 30d
  timezone: "Australia/Sydney"
  drop_params: True
  cache: True
  cache_params:
    type: redis
    host: localhost
    port: 6379
    url: redis://localhost:6379/0
  check_provider_endpoint: false
  num_retries: 3
  retry_after: 10
  request_timeout: 300
  allowed_fails: 2
  cooldown_time: 10
My OpenCode config is similar to yours, but I noticed you didn't define any authentication. I am getting an authentication error because my LiteLLM proxy requires an API key. Have you been able to get it to work when the LiteLLM proxy requires authentication?
"dr-google": {
  "npm": "@ai-sdk/google",
  "name": "LiteLLM Google (passthrough)",
  "options": {
    "baseURL": "http://192.168.1.212:4000/vertex_ai/",
    "apiKey": "sk-xyzabc",
    "headers": {
      "x-litellm-api-key": "Bearer sk-xyzabc"
    }
  },
  "models": {
    "vertex_ai/gemini-3-pro-preview": {
      "name": "Gemini 3 Pro (Native)",
      "limit": {
        "context": 1048576,
        "output": 65536
      },
      "cost": {
        "input": 2,
        "output": 12
      },
      "options": {
        "thinkingConfig": {
          "thinkingLevel": "high",
          "includeThoughts": true
        }
      }
    },
I am using it with Vertex, but I don't have use_in_pass_through: true in my config.
I also have individual model entries for each model, i.e. gemini-3-pro-preview, gemini-3-flash-preview, etc.
My auth to Vertex is handled via env vars: GOOGLE_APPLICATION_CREDENTIALS (pointing to the keyfile with credentials), VERTEX_PROJECT and VERTEX_LOCATION, but I suppose this won't matter much.
For client authentication, you should not set the credentials in the config file. Use the /connect option later in the UI to provide the API key.
Thanks @emerzon, I've been running in circles trying to get that to work! I was also using the OpenAI-compatible provider. I don't seem to get cost reported in OpenCode with this setup, though, even though my LiteLLM instance returns token usage. Any idea how to fix that too?
I think the only way so far is to manually set costs in the model definition:
"litellm-google": {
  "npm": "@ai-sdk/google",
  "name": "LiteLLM (Google)",
  "options": {
    "baseURL": "https://llm.instance"
  },
  "models": {
    "gemini-3-pro-preview": {
      "name": "Gemini 3 Pro Preview",
      "cost": {
        "input": 2,
        "output": 12,
        "cache_read": 0.2
      },
      "limit": {
        "context": 1000000,
        "output": 65536
      },
      "options": {
        "includeThoughts": true
      },
      "variants": {
        "high": {
          "options": {
            "thinkingConfig": {
              "thinkingLevel": "high"
            }
          }
        },
        "low": {
          "options": {
            "thinkingConfig": {
              "thinkingLevel": "low"
            }
          }
        },
      }
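To show what the manual `cost` entries imply for a single request, here is a small arithmetic sketch. It assumes the rates are USD per million tokens and that cached input tokens are billed at `cache_read` instead of `input`. Both are assumptions, as are the function name `estimate_cost_usd` and the flat `cached_tokens` usage key.

```python
TOKENS_PER_UNIT = 1_000_000  # assumed: rates are quoted per million tokens


def estimate_cost_usd(usage: dict, cost: dict) -> float:
    """Estimate the cost of one request from token counts and per-million rates."""
    cached = usage.get("cached_tokens", 0)
    fresh_input = usage["prompt_tokens"] - cached
    total = (
        fresh_input * cost["input"]
        # Fall back to the normal input rate when no cache_read rate is configured.
        + cached * cost.get("cache_read", cost["input"])
        + usage["completion_tokens"] * cost["output"]
    )
    return total / TOKENS_PER_UNIT


# Rates copied from the model definition above.
rates = {"input": 2, "output": 12, "cache_read": 0.2}
# Made-up token counts for illustration.
usage = {"prompt_tokens": 10_000, "cached_tokens": 4_000, "completion_tokens": 1_500}

estimate = estimate_cost_usd(usage, rates)
print(f"{estimate:.4f} USD")
```

With these made-up counts, the uncached input contributes 6,000 × 2, the cached input 4,000 × 0.2, and the output 1,500 × 12, all divided by one million.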
Thanks @emerzon, that worked.
I upgraded to a newer version and now it seems to be fixed, even with "npm": "@ai-sdk/openai-compatible".