Gemini models via OpenRouter not supported
What happened?
The following code snippet works:
```python
config = {
    "model": "anthropic/claude-3.5-sonnet",
    "base_url": "https://openrouter.ai/api/v1",
    "model_info": {
        "vision": True,
        "function_calling": True,
        "json_output": False,
        "family": "claude-3.5-sonnet"
    }
}

model = config["model"]
api_key = settings.OPENROUTER_KEY
base_url = config["base_url"]
model_info = config.get("model_info", {})

model_client = OpenAIChatCompletionClient(
    model=model,
    api_key=api_key,
    base_url=base_url,
    model_info=model_info
)

# ... create the agent, etc. (rest of the code) ...
response = await agent.on_messages(messages=messages, cancellation_token=cancellation_token)
```
The above code also works when we change "model": "anthropic/claude-3.5-sonnet" to "model": "openai/gpt-4o-2024-11-20" and "family": "claude-3.5-sonnet" to "family": "gpt-4o".
However, when I change the model to Gemini Flash from https://openrouter.ai/google/gemini-2.0-flash-001, i.e. "model": "google/gemini-2.0-flash-001" and "family": "gemini-2.0-flash" (picked from https://microsoft.github.io/autogen/stable//reference/python/autogen_core.models.html#autogen_core.models.ModelInfo), the code fails as shown below. I tried with family "unknown" as well.
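For clarity, the failing configuration is the same snippet as above with only the model and family swapped:

```python
config = {
    "model": "google/gemini-2.0-flash-001",
    "base_url": "https://openrouter.ai/api/v1",
    "model_info": {
        "vision": True,
        "function_calling": True,
        "json_output": False,
        "family": "gemini-2.0-flash"
    }
}
```

The failure: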
```
venv_autogen_latest/lib/python3.12/site-packages/autogen_agentchat/agents/_assistant_agent.py:416: UserWarning: Resolved model mismatch: google/gemini-2.0-flash-001 != None. Model mapping in autogen_ext.models.openai may be incorrect.
  model_result = await self._model_client.create(
=== Exception during agent.on_messages call ===
'NoneType' object is not subscriptable
Traceback (most recent call last):
  File "/Users/ravishq/Library/CloudStorage/[email protected]/My Drive/iamai/autogen-ms/agent_backyard.py", line 38, in run_task
    response = await agent.on_messages(messages=messages,cancellation_token=cancellation_token)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ravishq/Library/CloudStorage/[email protected]/My Drive/iamai/venv_autogen_latest/lib/python3.12/site-packages/autogen_agentchat/agents/_assistant_agent.py", line 370, in on_messages
    async for message in self.on_messages_stream(messages, cancellation_token):
  File "/Users/ravishq/Library/CloudStorage/[email protected]/My Drive/iamai/venv_autogen_latest/lib/python3.12/site-packages/autogen_agentchat/agents/_assistant_agent.py", line 416, in on_messages_stream
    model_result = await self._model_client.create(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ravishq/Library/CloudStorage/[email protected]/My Drive/iamai/venv_autogen_latest/lib/python3.12/site-packages/autogen_ext/models/openai/_openai_client.py", line 569, in create
    choice: Union[ParsedChoice[Any], ParsedChoice[BaseModel], Choice] = result.choices[0]
                                                                        ~~~~~~~~~~~~~~^^^
TypeError: 'NoneType' object is not subscriptable
Error: 'NoneType' object is not subscriptable
```
Which packages was the bug in?
Python AgentChat (autogen-agentchat>=0.4.0)
AutoGen library version.
Python 0.4.7
Other library version.
No response
Model used
gpt4o, sonnet 3.5, gemini flash 2.0
Model provider
OpenRouter
Other model provider
No response
Python version
3.12
.NET version
None
Operating system
MacOS
@ekzhu do you know if OpenRouter presents all models as OpenAI-compatible, or is Gemini different?
@jackgerrits - OpenRouter's claim to fame is that it provides a unified API: all models can be accessed via an OpenAI-compatible API schema.
https://openrouter.ai/docs/quickstart
https://openrouter.ai/docs/api-reference/overview
Verbatim info from the above links:
OpenRouter’s request and response schemas are very similar to the OpenAI Chat API, with a few small differences. At a high level, OpenRouter normalizes the schema across models and providers so you only need to learn one.
Another verbatim quote, from this link - https://openrouter.ai/openai/o1/api:
OpenRouter provides an OpenAI-compatible completion API to 300+ models & providers that you can call directly, or using the OpenAI SDK. Additionally, some third-party SDKs are available.
They do say "very similar", but there is a reason the AI community is doubling down on OpenRouter and LiteLLM: we want a single interface for all AI models so integrations stay model-agnostic. Hope this helps. If you find that Gemini's response is not OpenAI-compatible, do print the logs here and I will log a bug with OpenRouter. However, it seems to me that this is not even about the API response format; there is some other mapping problem in AutoGen:
UserWarning: Resolved model mismatch: google/gemini-2.0-flash-001 != None. Model mapping in autogen_ext.models.openai may be incorrect.
Copy-pasted :) the exact error from this ticket again, so that you can search for it in your code base.
Waiting for this fix! Let's keep building!! Hyped that MS is putting the best minds in the world on this... so doubling down on AutoGen while backstabbing LangChain, CrewAI, and smolagents... LFG!
UserWarning: Resolved model mismatch: google/gemini-2.0-flash-001 != None. Model mapping in autogen_ext.models.openai may be incorrect.
Yeah, I think this warning is probably okay, but I could be wrong here.
The error in the issue indicates that result.choices is None. It would be good to reduce the repro down to just a model client call.
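For example, something along these lines (an untested sketch; the config values are taken from the issue):

```python
import asyncio

from autogen_core.models import UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient


async def main() -> None:
    model_client = OpenAIChatCompletionClient(
        model="google/gemini-2.0-flash-001",
        api_key="sk-or-...",  # your OpenRouter key
        base_url="https://openrouter.ai/api/v1",
        model_info={
            "vision": True,
            "function_calling": True,
            "json_output": False,
            "family": "gemini-2.0-flash",
        },
    )
    # A single create() call, bypassing the agent layer entirely.
    result = await model_client.create(
        [UserMessage(content="Say hello.", source="user")]
    )
    print(result.content)


asyncio.run(main())
```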
I don't have access to OpenRouter at the moment, so I will wait to see what @ekzhu thinks.
You should try to integrate a header like in the example.
In the example it worked; I have not tried it with OpenRouter yet.
I am getting the same error from OpenRouter when using Claude models, but it works with OpenAI models.
See my response in #5583
At this point I don't know the cause of it.
@ekzhu I am hoping that you are changing the family in your code when trying with Claude in the code snippet you pasted here - https://github.com/microsoft/autogen/issues/5583
Because Claude works just fine. This is the config that works:
```python
{
    "model": "anthropic/claude-3.5-sonnet",
    "base_url": "https://openrouter.ai/api/v1",
    "api_type": "anthropic",
    "model_info": {
        "vision": True,
        "function_calling": True,
        "json_output": False,
        "family": "claude-3.5-sonnet"
    }
}
```
ignore "api_type" key. Let me know if using above as well doesnt work for you for Claude models. Like i said, Claude works, Gemini doesnt. So we need to be on same page in terms of "reproducibility" of this issue else it will die a slow death and so would my project :D
Awaiting for your response on this...
It's more about the model name than the model family. Have you tried just calling OpenRouter directly using the openai library? Because from the error message it seems the failure happened because the server returned None in result.choices.
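For example (an untested sketch; the point is to see whether choices is None at the raw SDK level too):

```python
from openai import OpenAI

# Talk to OpenRouter directly through the OpenAI SDK, bypassing AutoGen.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",  # your OpenRouter key
)

completion = client.chat.completions.create(
    model="google/gemini-2.0-flash-001",
    messages=[{"role": "user", "content": "What is the weather in New York?"}],
)

# If choices is None here as well, the problem is on the OpenRouter side.
print(completion.choices)
```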
I had the same error and inspected the result where the traceback comes from:

```python
>>> result.model_extra
{'error': {'message': 'Provider returned error',
  'code': 400,
  'metadata': {'raw': '{"type":"error","error":{"type":"invalid_request_error","message":"Requests which include `tool_use` or `tool_result` blocks must define tools."}}',
   'provider_name': 'Google',
   'isDownstreamPipeClean': True,
   'isErrorUpstreamFault': False}},
 'user_id': 'xxx'}
```
I looked a bit more. I think the problem is:
- OpenAI allows tool calls to be in the message history even when the current API call does not include tools for the model
- Some OpenRouter models seem not to allow this
These were the messages sent to the LLM when I had the error:
```python
[{'content': 'You are a helpful assistant.', 'role': 'system'},
 {'content': 'What is the weather in New York?',
  'role': 'user',
  'name': 'user'},
 {'tool_calls': [{'id': 'toolu_vrtx_01UonpGhPPQbzMNj8JaSREjv',
    'function': {'arguments': '{"city": "New York"}', 'name': 'get_weather'},
    'type': 'function'}],
  'role': 'assistant',
  'name': 'weather_agent'},
 {'content': 'The weather in New York is 73 degrees and Sunny.',
  'role': 'tool',
  'tool_call_id': 'toolu_vrtx_01UonpGhPPQbzMNj8JaSREjv'}]
```
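One could presumably confirm this outside AutoGen by replaying the same history through the OpenAI SDK without a tools parameter (untested sketch; per the observation above, api.openai.com accepts this history, while the providers behind OpenRouter should return the "must define tools" error):

```python
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="sk-or-...")

# Same conversation as above, but no `tools` parameter on this call.
completion = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the weather in New York?"},
        {"role": "assistant", "tool_calls": [{
            "id": "toolu_vrtx_01UonpGhPPQbzMNj8JaSREjv",
            "function": {"arguments": '{"city": "New York"}', "name": "get_weather"},
            "type": "function"}]},
        {"role": "tool",
         "content": "The weather in New York is 73 degrees and Sunny.",
         "tool_call_id": "toolu_vrtx_01UonpGhPPQbzMNj8JaSREjv"},
    ],
)

# OpenRouter reports the provider error in the response body.
print(completion.model_extra)
```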
@philippHorn thanks for sharing the messages. This means the issue lies with OpenRouter. Maybe I will write a wrapper in my code: if the model is Gemini, I don't call agent.on_messages(...) but instead get the response using Gemini's official Python package, and whatever response I get, I attach it to the agent's state in the format it expects... still thinking about how to work around it.
Can I participate in this issue and submit a PR to fix it? I want to help the community.
@ravishqureshi You're welcome. By the way, the issue is not specific to Gemini; I have it with Claude as well.
I'd be curious to understand this a bit. What seems to happen:
- The agent makes the first LLM call with the tools included for the LLM to select
- The LLM responds with the tool call and the tool is run
- Another LLM call is made, where the output from the tool is fed into the LLM as a tool message, but without giving the LLM the tool to call anymore

Why are the tools not available in the second LLM call? Is it to force the LLM to compile an answer instead and to prevent too many LLM API calls? Does that mean the agent can't call the same tool multiple times within a run? I think the tradeoff here is that the LLM loses access to the tool description; not sure if that is a big problem.
Here is a full script that reproduces the issue; just swap in your OpenRouter API key:
```python
import asyncio
from pprint import pprint

from autogen_agentchat.agents import AssistantAgent
from autogen_core.models import ModelFamily, ModelInfo
from autogen_ext.models.openai import OpenAIChatCompletionClient
from langsmith.wrappers import wrap_openai

info: ModelInfo = {
    "vision": False,
    "function_calling": True,
    "json_output": False,
    "family": ModelFamily.CLAUDE_3_5_SONNET,
}

model_client = OpenAIChatCompletionClient(
    model="anthropic/claude-3.5-sonnet",
    api_key="xxxx",
    model_info=info,
    base_url="https://openrouter.ai/api/v1",
)


def get_weather(city: str) -> str:
    """Get the weather for a given city."""
    return "The weather is 73 degrees and Sunny."


agent = AssistantAgent(
    name="weather_agent",
    model_client=model_client,
    tools=[get_weather],
    system_message="You are a helpful assistant.",
    reflect_on_tool_use=True,
)

# Wrap the underlying client for LangSmith tracing.
model_client._client = wrap_openai(model_client._client)


def main() -> None:
    result = asyncio.run(agent.run(task="What is the weather in New York?"))
    pprint(result)


main()
```
I've confirmed that the crash goes away if I add this line, giving the LLM tool access on the second call:
@philippHorn the reason for not including the tools in the reflection step is that we want to force a text response.
You can set reflect_on_tool_use=False to disable the second inference, and repeatedly call the agent without a task or message to get the agent to repeatedly execute tools.
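Applied to the repro script above, that would look something like this (sketch):

```python
agent = AssistantAgent(
    name="weather_agent",
    model_client=model_client,
    tools=[get_weather],
    system_message="You are a helpful assistant.",
    # Skip the second, tool-less inference that triggers the provider error.
    reflect_on_tool_use=False,
)
```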
On the model provider compatibility issue: is this only related to OpenRouter? Because if you use Gemini directly from OpenAIChatCompletionClient, it works fine.
A more complete fix would be to add a tool_choice parameter to the extra_create_args in the reflection inference call, and set tool_choice=None. However, the syntax is only for OpenAI and not translated to other providers: https://platform.openai.com/docs/api-reference/chat/create#chat-create-tool_choice. We will need to add a new tool_choice parameter to the ChatCompletionClient base class for this to work.
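A hypothetical sketch of what that internal reflection call could look like (not exposed by AssistantAgent today; note that the OpenAI API expects the string "none" rather than Python's None to disable tool use):

```python
# Inside the reflection step, roughly (hypothetical; names mirror
# autogen_agentchat's _assistant_agent.py):
model_result = await self._model_client.create(
    llm_messages,
    extra_create_args={"tool_choice": "none"},  # force a plain text response
    cancellation_token=cancellation_token,
)
```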
Thanks, for now I use this as a workaround:
```python
import asyncio

from autogen_agentchat.messages import TextMessage

MAX_LLM_CALLS = 5  # assumed cap; the original value is not shown

for attempt in range(MAX_LLM_CALLS):
    # Re-run the agent without a new task; it continues from its current state.
    result = asyncio.run(agent.run(task=None))
    if any(isinstance(message, TextMessage) for message in result.messages):
        break  # the agent produced a final text answer
else:
    raise ValueError("Max attempts exceeded without LLM answer")
```
It seems to work well in practice, but like this it is probably not production-ready.
@philippHorn, I created an issue to address this: https://github.com/microsoft/autogen/issues/5732
Some feedback after using the approach without reflect_on_tool_use=True:
- Generally it works pretty well, at least for my use case
- But here and there I get cases where an agent decides to call tools again and again until the max attempts are reached
- For those cases it would be nice to have a way to enforce a text answer. Right now I return a static "I don't know" answer, but it would be nicer to generate a dynamic answer with the agent; it would then adhere to the prompt and incorporate context.
Created a new issue to support built-in iteration in AssistantAgent #6268