Gemini models via OpenRouter not supported
What happened?
The following code snippet works:
```python
config = {
    "model": "anthropic/claude-3.5-sonnet",
    "base_url": "https://openrouter.ai/api/v1",
    "model_info": {
        "vision": True,
        "function_calling": True,
        "json_output": False,
        "family": "claude-3.5-sonnet"
    }
}

model = config["model"]
api_key = settings.OPENROUTER_KEY
base_url = config["base_url"]
model_info = config.get("model_info", {})

model_client = OpenAIChatCompletionClient(
    model=model,
    api_key=api_key,
    base_url=base_url,
    model_info=model_info
)

# ... create the agent, etc. (rest of the code) ...
response = await agent.on_messages(messages=messages, cancellation_token=cancellation_token)
```
The above code also works when we change "model": "anthropic/claude-3.5-sonnet" to "model": "openai/gpt-4o-2024-11-20" and "family": "claude-3.5-sonnet" to "family": "gpt-4o".
However, when I change the model to Gemini Flash from https://openrouter.ai/google/gemini-2.0-flash-001, i.e. "model": "google/gemini-2.0-flash-001" and "family": "gemini-2.0-flash" (picked from https://microsoft.github.io/autogen/stable//reference/python/autogen_core.models.html#autogen_core.models.ModelInfo), the code fails as shown below. I tried with family "unknown" as well.
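For clarity, the failing configuration is the same snippet as above with only the model and family swapped:

```python
config = {
    "model": "google/gemini-2.0-flash-001",
    "base_url": "https://openrouter.ai/api/v1",
    "model_info": {
        "vision": True,
        "function_calling": True,
        "json_output": False,
        "family": "gemini-2.0-flash"
    }
}
```

The failure: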
```
venv_autogen_latest/lib/python3.12/site-packages/autogen_agentchat/agents/_assistant_agent.py:416: UserWarning: Resolved model mismatch: google/gemini-2.0-flash-001 != None. Model mapping in autogen_ext.models.openai may be incorrect.
  model_result = await self._model_client.create(
=== Exception during agent.on_messages call ===
'NoneType' object is not subscriptable
Traceback (most recent call last):
  File "/Users/ravishq/Library/CloudStorage/[email protected]/My Drive/iamai/autogen-ms/agent_backyard.py", line 38, in run_task
    response = await agent.on_messages(messages=messages,cancellation_token=cancellation_token)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ravishq/Library/CloudStorage/[email protected]/My Drive/iamai/venv_autogen_latest/lib/python3.12/site-packages/autogen_agentchat/agents/_assistant_agent.py", line 370, in on_messages
    async for message in self.on_messages_stream(messages, cancellation_token):
  File "/Users/ravishq/Library/CloudStorage/[email protected]/My Drive/iamai/venv_autogen_latest/lib/python3.12/site-packages/autogen_agentchat/agents/_assistant_agent.py", line 416, in on_messages_stream
    model_result = await self._model_client.create(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ravishq/Library/CloudStorage/[email protected]/My Drive/iamai/venv_autogen_latest/lib/python3.12/site-packages/autogen_ext/models/openai/_openai_client.py", line 569, in create
    choice: Union[ParsedChoice[Any], ParsedChoice[BaseModel], Choice] = result.choices[0]
                                                                        ~~~~~~~~~~~~~~^^^
TypeError: 'NoneType' object is not subscriptable
Error: 'NoneType' object is not subscriptable
```
Which packages was the bug in?
Python AgentChat (autogen-agentchat>=0.4.0)
AutoGen library version.
Python 0.4.7
Other library version.
No response
Model used
gpt4o, sonnet 3.5, gemini flash 2.0
Model provider
OpenRouter
Other model provider
No response
Python version
3.12
.NET version
None
Operating system
MacOS
@ekzhu do you know if OpenRouter presents all models as OpenAI-compatible, or is Gemini different?
@jackgerrits - OpenRouter's claim to fame is that it provides a unified API: all models can be accessed via an OpenAI-compatible API schema.
https://openrouter.ai/docs/quickstart
https://openrouter.ai/docs/api-reference/overview
Verbatim info from the above links:
OpenRouter’s request and response schemas are very similar to the OpenAI Chat API, with a few small differences. At a high level, OpenRouter normalizes the schema across models and providers so you only need to learn one.
Another verbatim quote, from this link - https://openrouter.ai/openai/o1/api:
OpenRouter provides an OpenAI-compatible completion API to 300+ models & providers that you can call directly, or using the OpenAI SDK. Additionally, some third-party SDKs are available.
They do say "very similar", but there is a reason the AI community is doubling down on OpenRouter and LiteLLM: we want a single interface for all AI models so integrations stay model-agnostic. Hope this helps. If you find that Gemini's response is not OpenAI-compatible, do print the logs here and I will log a bug with OpenRouter. However, it seems to me that this is not even about the API response format; there is some other mapping problem in AutoGen:
UserWarning: Resolved model mismatch: google/gemini-2.0-flash-001 != None. Model mapping in autogen_ext.models.openai may be incorrect.
Copy-pasted :) the exact error from this ticket again, so that you can search for it in your code base.
Waiting for this fix! Let's keep building!! Hyped that MS is putting the best minds in the world on this... so doubling down on AutoGen while backstabbing LangChain, CrewAI, and smolagents... LFG!
UserWarning: Resolved model mismatch: google/gemini-2.0-flash-001 != None. Model mapping in autogen_ext.models.openai may be incorrect.
Yeah, I think this warning is probably okay, but I could be wrong here.
The error in the issue indicates that result.choices is None. It would be good to reduce the repro down to just a model client call.
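For example, something along these lines (an untested sketch; the config values are taken from the issue):

```python
import asyncio

from autogen_core.models import UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient


async def main() -> None:
    model_client = OpenAIChatCompletionClient(
        model="google/gemini-2.0-flash-001",
        api_key="sk-or-...",  # your OpenRouter key
        base_url="https://openrouter.ai/api/v1",
        model_info={
            "vision": True,
            "function_calling": True,
            "json_output": False,
            "family": "gemini-2.0-flash",
        },
    )
    # A single create() call, bypassing the agent layer entirely.
    result = await model_client.create(
        [UserMessage(content="Say hello.", source="user")]
    )
    print(result.content)


asyncio.run(main())
```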
I don't have access to OpenRouter at the moment, so I will wait to see what @ekzhu thinks.
You should try to integrate a header like in the example.
In the example it worked; I have not tried it with OpenRouter yet.
I am getting the same error from OpenRouter when using Claude models, but it works with OpenAI models.
See my response in #5583
At this point I don't know the cause of it.
@ekzhu I am hoping that you are changing the family in your code when trying with Claude in the code snippet you pasted here - https://github.com/microsoft/autogen/issues/5583
Because Claude works just fine. This is the config that works:
```python
{
    "model": "anthropic/claude-3.5-sonnet",
    "base_url": "https://openrouter.ai/api/v1",
    "api_type": "anthropic",
    "model_info": {
        "vision": True,
        "function_calling": True,
        "json_output": False,
        "family": "claude-3.5-sonnet"
    }
}
```
ignore "api_type" key. Let me know if using above as well doesnt work for you for Claude models. Like i said, Claude works, Gemini doesnt. So we need to be on same page in terms of "reproducibility" of this issue else it will die a slow death and so would my project :D
Awaiting for your response on this...
It's more about the model name than the model family. Have you tried just calling OpenRouter directly using the openai library? Because from the error message it seems the failure happened because the server returned None in result.choices.
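For example (an untested sketch; the point is to see whether choices is None at the raw SDK level too):

```python
from openai import OpenAI

# Talk to OpenRouter directly through the OpenAI SDK, bypassing AutoGen.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",  # your OpenRouter key
)

completion = client.chat.completions.create(
    model="google/gemini-2.0-flash-001",
    messages=[{"role": "user", "content": "What is the weather in New York?"}],
)

# If choices is None here as well, the problem is on the OpenRouter side.
print(completion.choices)
```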
I had the same error and inspected the result where the traceback comes from:

```python
>>> result.model_extra
{'error': {'message': 'Provider returned error',
  'code': 400,
  'metadata': {'raw': '{"type":"error","error":{"type":"invalid_request_error","message":"Requests which include `tool_use` or `tool_result` blocks must define tools."}}',
   'provider_name': 'Google',
   'isDownstreamPipeClean': True,
   'isErrorUpstreamFault': False}},
 'user_id': 'xxx'}
```
I looked a bit more. I think the problem is:
- OpenAI allows tool calls to be in the message history even when the current API call does not include tools for the model
- Some OpenRouter models seem not to allow this
These were the messages sent to the LLM when I had the error:
```python
[{'content': 'You are a helpful assistant.', 'role': 'system'},
 {'content': 'What is the weather in New York?',
  'role': 'user',
  'name': 'user'},
 {'tool_calls': [{'id': 'toolu_vrtx_01UonpGhPPQbzMNj8JaSREjv',
    'function': {'arguments': '{"city": "New York"}', 'name': 'get_weather'},
    'type': 'function'}],
  'role': 'assistant',
  'name': 'weather_agent'},
 {'content': 'The weather in New York is 73 degrees and Sunny.',
  'role': 'tool',
  'tool_call_id': 'toolu_vrtx_01UonpGhPPQbzMNj8JaSREjv'}]
```
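One could presumably confirm this outside AutoGen by replaying the same history through the OpenAI SDK without a tools parameter (untested sketch; per the observation above, api.openai.com accepts this history, while the providers behind OpenRouter should return the "must define tools" error):

```python
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="sk-or-...")

# Same conversation as above, but no `tools` parameter on this call.
completion = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the weather in New York?"},
        {"role": "assistant", "tool_calls": [{
            "id": "toolu_vrtx_01UonpGhPPQbzMNj8JaSREjv",
            "function": {"arguments": '{"city": "New York"}', "name": "get_weather"},
            "type": "function"}]},
        {"role": "tool",
         "content": "The weather in New York is 73 degrees and Sunny.",
         "tool_call_id": "toolu_vrtx_01UonpGhPPQbzMNj8JaSREjv"},
    ],
)

# OpenRouter reports the provider error in the response body.
print(completion.model_extra)
```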
@philippHorn thanks for sharing the messages. This means the issue lies with OpenRouter. Maybe I will write a wrapper in my code: if the model is Gemini, I don't call agent.on_messages(...) but instead get the response using Gemini's official Python package, and whatever response I get, I attach it to the agent's state in the format it expects... still thinking about how to work around it.
Can I participate in this issue and submit a PR to fix it? I want to help the community.
@ravishqureshi You're welcome. By the way, the issue is not specific to Gemini; I have it with Claude as well.
I'd be curious to understand this a bit. What seems to happen:
- The agent makes the first LLM call with the tools included for the LLM to select
- The LLM responds with the tool call and the tool is run
- Another LLM call is made, where the output from the tool is fed into the LLM as a tool message, but without giving the LLM the tool to call anymore

Why are the tools not available in the second LLM call? Is it to force the LLM to compile an answer instead and to prevent too many LLM API calls? Does that mean the agent can't call the same tool multiple times within a run? I think the tradeoff here is that the LLM loses access to the tool description; not sure if that is a big problem.
Here is a full script that reproduces the issue; just swap in your OpenRouter API key:
```python
import asyncio
from pprint import pprint

from autogen_agentchat.agents import AssistantAgent
from autogen_core.models import ModelFamily, ModelInfo
from autogen_ext.models.openai import OpenAIChatCompletionClient
from langsmith.wrappers import wrap_openai

info: ModelInfo = {
    "vision": False,
    "function_calling": True,
    "json_output": False,
    "family": ModelFamily.CLAUDE_3_5_SONNET,
}

model_client = OpenAIChatCompletionClient(
    model="anthropic/claude-3.5-sonnet",
    api_key="xxxx",
    model_info=info,
    base_url="https://openrouter.ai/api/v1",
)


def get_weather(city: str) -> str:
    """Get the weather for a given city."""
    return "The weather is 73 degrees and Sunny."


agent = AssistantAgent(
    name="weather_agent",
    model_client=model_client,
    tools=[get_weather],
    system_message="You are a helpful assistant.",
    reflect_on_tool_use=True,
)

# Wrap the underlying client for LangSmith tracing.
model_client._client = wrap_openai(model_client._client)


def main() -> None:
    result = asyncio.run(agent.run(task="What is the weather in New York?"))
    pprint(result)


main()
```
I've confirmed that the crash goes away if I add this line, giving the LLM tool access on the second call:
@philippHorn the reason for not including the tools in the reflection step is that we want to force a text response.
You can set reflect_on_tool_use=False to disable the second inference, and repeatedly call the agent without a task or message to get the agent to repeatedly execute tools.
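Applied to the repro script above, that would look something like this (sketch):

```python
agent = AssistantAgent(
    name="weather_agent",
    model_client=model_client,
    tools=[get_weather],
    system_message="You are a helpful assistant.",
    # Skip the second, tool-less inference that triggers the provider error.
    reflect_on_tool_use=False,
)
```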
On the model provider compatibility issue: is this only related to OpenRouter? Because if you use Gemini directly from OpenAIChatCompletionClient, it works fine.
A more complete fix would be to add a tool_choice parameter to the extra_create_args in the reflection inference call, and set tool_choice=None. However, the syntax is only for OpenAI and not translated to other providers: https://platform.openai.com/docs/api-reference/chat/create#chat-create-tool_choice. We will need to add a new tool_choice parameter to the ChatCompletionClient base class for this to work.
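A hypothetical sketch of what that internal reflection call could look like (not exposed by AssistantAgent today; note that the OpenAI API expects the string "none" rather than Python's None to disable tool use):

```python
# Inside the reflection step, roughly (hypothetical; names mirror
# autogen_agentchat's _assistant_agent.py):
model_result = await self._model_client.create(
    llm_messages,
    extra_create_args={"tool_choice": "none"},  # force a plain text response
    cancellation_token=cancellation_token,
)
```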
Thanks, for now I use this as a workaround:
```python
import asyncio

from autogen_agentchat.messages import TextMessage

MAX_LLM_CALLS = 5  # assumed cap; the original value is not shown

for attempt in range(MAX_LLM_CALLS):
    # Re-run the agent without a new task; it continues from its current state.
    result = asyncio.run(agent.run(task=None))
    if any(isinstance(message, TextMessage) for message in result.messages):
        break  # the agent produced a final text answer
else:
    raise ValueError("Max attempts exceeded without LLM answer")
```
It seems to work well in practice, but like this it is probably not production-ready.
@philippHorn, I created an issue to address this: https://github.com/microsoft/autogen/issues/5732
Some feedback after using the approach without reflect_on_tool_use=True:
- Generally it works pretty well, at least for my use case
- But here and there I get cases where an agent decides to call tools again and again until the max attempts are reached
- For those cases it would be nice to have a way to enforce a text answer. Right now I return a static "I don't know" answer, but it would be nicer to generate a dynamic answer with the agent; it would then adhere to the prompt and incorporate context.
Created a new issue to support built-in iteration in AssistantAgent #6268