from autogen_ext.models.ollama import OllamaChatCompletionClient: import succeeds, but using the client fails
What happened?
The import succeeds, but using OllamaChatCompletionClient fails.
Which package was the bug in?
Python Extensions (autogen-ext)
AutoGen library version.
Python dev (main branch)
Other library version.
No response
Model used
No response
Model provider
None
Other model provider
No response
Python version
None
.NET version
None
Operating system
None
First of all, this is not a bug. haha
When you use a model name that is not in the built-in model table, you need to provide the model_info parameter. AutoGen works this way by design: your model name must be listed in ollama/_model_info.py, or you must supply model_info yourself.
If you're trying to use a custom model like "gpt-4o-mini", you need to explicitly provide the model info like this:
```python
model_client = OllamaChatCompletionClient(
    model="chevalblanc/gpt-4o-mini:latest",
    host="http://192.168.1.155:11434",
    model_info={
        "vision": False,
        "function_calling": False,
        "json_output": False,
        "family": "unknown",
        "structured_output": False,
    },
)
```
You can adjust the capabilities (vision, function_calling, etc.) based on what your model actually supports. If you're unsure about the model's capabilities, starting with all set to False and then adjusting as needed is a reasonable approach.
For reference, here are the models currently defined in the model_info dictionary:
all-minilm, bge-m3, codegemma, codellama, command-r, deepseek-coder, deepseek-coder-v2,
deepseek-r1, dolphin-llama3, dolphin-mistral, dolphin-mixtral, gemma, gemma2, llama2,
llama2-uncensored, llama3, llama3.1, llama3.2, llama3.2-vision, llama3.3, llava,
llava-llama3, mistral, mistral-nemo, mixtral, mxbai-embed-large, nomic-embed-text,
orca-mini, phi, phi3, phi3.5, phi4, qwen, qwen2, qwen2.5, qwen2.5-coder,
snowflake-arctic-embed, starcoder2, tinyllama, wizardlm2, yi, zephyr
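Conversely, if your model name is already on that list, you can omit model_info entirely. A minimal sketch, assuming a locally pulled llama3.1:

```python
from autogen_ext.models.ollama import OllamaChatCompletionClient

# llama3.1 is in the built-in model info table, so no model_info is required.
model_client = OllamaChatCompletionClient(model="llama3.1")
```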
Hope this helps!
If that direction solves it for you, please close the issue.
Thank you very much! I have two more questions that I hope to get help with. Here is my code:
```python
import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination
from autogen_core import CancellationToken
from autogen_ext.tools.mcp import StdioServerParams, mcp_server_tools
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.models.ollama import OllamaChatCompletionClient
from dotenv import load_dotenv
import autogen
import os

# Load environment variables
load_dotenv()

llm_config = {"config_list": [
    {
        "model": "deepseek-chat",
        "api_key": os.getenv("DEEPSEEK_API_KEY"),
        "base_url": os.getenv("DEEPSEEK_BASE_URL"),
        "tags": ["paid"]
    }]}

async def main():
    # Set up the MCP fetch server parameters
    fetch_mcp_server = StdioServerParams(command="node", args=["E:/ruanjian/Trae/work/fetch-mcp-server/fetch-mcp-main/dist/index.js"])
    # Get the fetch tool from the MCP server
    tools = await mcp_server_tools(fetch_mcp_server)

    # model_client = OpenAIChatCompletionClient(model="gpt-4o")
    model_client = OllamaChatCompletionClient(
        model="qwen2.5:72b",
        host="http://127.0.0.1:11435/v1/",
        model_info={  # non-OpenAI models need this configured
            "vision": False,
            "function_calling": True,
            "json_output": True,
            "family": "unknown",
            "structured_output": True,
        },
    )
    print(model_client)

    fetch_agent = AssistantAgent(
        name="content_fetcher",
        system_message="You are a web content fetching assistant. Use the fetch tool to get web page content.",
        model_client=model_client,
        tools=tools
    )

    # Create rewriter Agent (unchanged)
    rewriter_agent = AssistantAgent(
        name="content_rewriter",
        system_message="""You are a content rewriting expert. Rewrite the web content given to you as a tech-news style article.
Tech-news style traits:
1. Concise, eye-catching title
2. Opening that states the topic directly
3. Objective, accurate, yet lively content
4. Technical terms used but clearly explained
5. Short paragraphs with clear emphasis
When you have finished the rewrite, reply TERMINATE.""",
        model_client=model_client
    )

    # Set up termination condition and team (unchanged)
    termination = TextMentionTermination("TERMINATE")
    team = RoundRobinGroupChat([fetch_agent, rewriter_agent], termination_condition=termination)

    # Run the workflow (unchanged)
    result = await team.run(
        task="Fetch the content of https://www.aivi.fyi/llms/introduce-Claude-3.7-Sonnet, then rewrite it as a tech-news style article",
        cancellation_token=CancellationToken()
    )
    print("\nFinal rewritten result:\n")
    print(result.messages[-1].content)
    return result

# This is the correct way to run async code in a Python script
if __name__ == "__main__":
    asyncio.run(main())
```
I can run the above code normally with GPT-4o, but switching to another model such as qwen2.5:72b produces the following error:
```
PS E:\ruanjian\Trae\work> & D:/work/python3/python.exe e:/ruanjian/Trae/work/autogen_agents/mcp_autogen_1.py
<autogen_ext.models.ollama._ollama_client.OllamaChatCompletionClient object at 0x000002978713A6C0>
Error processing publish message for content_fetcher_64afb88b-2c0f-4ef6-a884-0446fc1db0e7/64afb88b-2c0f-4ef6-a884-0446fc1db0e7
Traceback (most recent call last):
File "D:\work\python3\Lib\site-packages\autogen_core\_single_threaded_agent_runtime.py", line 510, in _on_message
return await agent.on_message(
^^^^^^^^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\autogen_core\_base_agent.py", line 113, in on_message
return await self.on_message_impl(message, ctx)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\autogen_agentchat\teams\_group_chat\_sequential_routed_agent.py", line 67, in on_message_impl
return await super().on_message_impl(message, ctx)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\autogen_core\_routed_agent.py", line 485, in on_message_impl
return await h(self, message, ctx)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\autogen_core\_routed_agent.py", line 268, in wrapper
return_value = await func(self, message, ctx) # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\autogen_agentchat\teams\_group_chat\_chat_agent_container.py", line 69, in handle_request
async for msg in self._agent.on_messages_stream(self._message_buffer, ctx.cancellation_token):
File "D:\work\python3\Lib\site-packages\autogen_agentchat\agents\_assistant_agent.py", line 748, in on_messages_stream
async for inference_output in self._call_llm(
File "D:\work\python3\Lib\site-packages\autogen_agentchat\agents\_assistant_agent.py", line 870, in _call_llm
model_result = await model_client.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\autogen_ext\models\ollama\_ollama_client.py", line 477, in create
result: ChatResponse = await future
^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\ollama\_client.py", line 837, in chat
return await self._request(
^^^^^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\ollama\_client.py", line 682, in _request
return cls(**(await self._request_raw(*args, **kwargs)).json())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\ollama\_client.py", line 626, in _request_raw
raise ResponseError(e.response.text, e.response.status_code) from None
ollama._types.ResponseError: 404 page not found (status code: 404)
Traceback (most recent call last):
File "e:\ruanjian\Trae\work\autogen_agents\mcp_autogen_1.py", line 88, in <module>
asyncio.run(main())
File "D:\work\python3\Lib\asyncio\runners.py", line 194, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\asyncio\runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\asyncio\base_events.py", line 684, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "e:\ruanjian\Trae\work\autogen_agents\mcp_autogen_1.py", line 77, in main
result = await team.run(
^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\autogen_agentchat\teams\_group_chat\_base_group_chat.py", line 277, in run
async for message in self.run_stream(
File "D:\work\python3\Lib\site-packages\autogen_agentchat\teams\_group_chat\_base_group_chat.py", line 482, in run_stream
await shutdown_task
File "D:\work\python3\Lib\site-packages\autogen_agentchat\teams\_group_chat\_base_group_chat.py", line 426, in stop_runtime
await self._runtime.stop_when_idle()
File "D:\work\python3\Lib\site-packages\autogen_core\_single_threaded_agent_runtime.py", line 746, in stop_when_idle
await self._run_context.stop_when_idle()
File "D:\work\python3\Lib\site-packages\autogen_core\_single_threaded_agent_runtime.py", line 120, in stop_when_idle
await self._run_task
File "D:\work\python3\Lib\site-packages\autogen_core\_single_threaded_agent_runtime.py", line 109, in _run
await self._runtime._process_next() # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\autogen_core\_single_threaded_agent_runtime.py", line 581, in _process_next
raise e from None
File "D:\work\python3\Lib\site-packages\autogen_core\_single_threaded_agent_runtime.py", line 528, in _process_publish
await asyncio.gather(*responses)
File "D:\work\python3\Lib\site-packages\autogen_core\_single_threaded_agent_runtime.py", line 523, in _on_message
raise e
File "D:\work\python3\Lib\site-packages\autogen_core\_single_threaded_agent_runtime.py", line 510, in _on_message
return await agent.on_message(
^^^^^^^^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\autogen_core\_base_agent.py", line 113, in on_message
return await self.on_message_impl(message, ctx)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\autogen_agentchat\teams\_group_chat\_sequential_routed_agent.py", line 67, in on_message_impl
return await super().on_message_impl(message, ctx)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\autogen_core\_routed_agent.py", line 485, in on_message_impl
return await h(self, message, ctx)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\autogen_core\_routed_agent.py", line 268, in wrapper
return_value = await func(self, message, ctx) # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\autogen_agentchat\teams\_group_chat\_chat_agent_container.py", line 69, in handle_request
async for msg in self._agent.on_messages_stream(self._message_buffer, ctx.cancellation_token):
File "D:\work\python3\Lib\site-packages\autogen_agentchat\agents\_assistant_agent.py", line 748, in on_messages_stream
async for inference_output in self._call_llm(
File "D:\work\python3\Lib\site-packages\autogen_agentchat\agents\_assistant_agent.py", line 870, in _call_llm
model_result = await model_client.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\autogen_ext\models\ollama\_ollama_client.py", line 477, in create
result: ChatResponse = await future
^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\ollama\_client.py", line 837, in chat
return await self._request(
^^^^^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\ollama\_client.py", line 682, in _request
return cls(**(await self._request_raw(*args, **kwargs)).json())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\work\python3\Lib\site-packages\ollama\_client.py", line 626, in _request_raw
raise ResponseError(e.response.text, e.response.status_code) from None
ollama._types.ResponseError: 404 page not found (status code: 404)
```
Can you debug? What request is being sent to the Ollama server? It looks like you are getting a 404 -- is the model running?
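One quick way to check is to hit the server with the ollama SDK directly. A minimal sketch, assuming your forwarded host on port 11435:

```python
import asyncio
from ollama import AsyncClient

async def check() -> None:
    # The SDK targets the native Ollama API (/api/chat) and appends the path itself,
    # so the host must not include /v1/ or /api/chat.
    client = AsyncClient(host="http://127.0.0.1:11435")
    resp = await client.chat(model="qwen2.5:72b", messages=[{"role": "user", "content": "hello"}])
    print(resp)

asyncio.run(check())
```

If this already returns 404, the problem is the URL or the server, not AutoGen.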
The model is running, and the MCP server is running too.
For OllamaChatCompletionClient, you can just do this:
```python
model_client = OllamaChatCompletionClient(
    model="qwen2.5:72b",
)
```
I think there is some issue with your URL.
Also, I edited your comment to add
```python
```
around code blocks.
fetch-mcp-server.zip
My Ollama is not local; it is reached through port forwarding to another server, while this configuration can only access the local server. As shown in the diagram, the connection does produce output. Also, the error is not that the model cannot connect, but that the MCP call cannot be found. I used the GPT-4o model, and I suspect it returned an unexpected result, so the MCP service cannot be called successfully. My code is complete; you can try it out.
@xsw1006931693
Let's check step by step to find the real issue:
- Run it with AutoGen without MCP
- Run it with AutoGen with MCP too

First of all, OllamaChatCompletionClient in AutoGen uses the ollama SDK. So how about changing the base URL from .../v1 to / ? The host parameter is just a host; the ollama SDK builds its own API paths on top of it, so with a bare host the right endpoint should work too.
```python
import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination
from autogen_core import CancellationToken
from autogen_ext.tools.mcp import StdioServerParams, mcp_server_tools
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.models.ollama import OllamaChatCompletionClient
from autogen_core.models import UserMessage

client = OllamaChatCompletionClient(
    model="gemma2:2b",
    host="http://localhost:11435",
    api_key="ollama",
    model_info={
        "vision": False,
        "function_calling": True,
        "json_output": True,
        "family": "unknown",
    },
)

messages = [
    UserMessage(content="hello", source="user"),
]

print(asyncio.run(client.create(messages=messages)))
```
Here is my test case. When I use the same setup as you, I get the same error.
```python
client = OllamaChatCompletionClient(
    model="llama3.1:latest",
    host="127.0.0.1:11435/v1",
    api_key="ollama",
)

messages = [
    UserMessage(content="hello", source="user"),
]

print(asyncio.run(client.create(messages=messages)))
```
Error:
```
raise ResponseError(e.response.text, e.response.status_code) from None
ollama._types.ResponseError: 404 page not found (status code: 404)
```
Also, note that in AutoGen a wrong URL does not raise an error until you actually run a request.
Case:
```python
client = OllamaChatCompletionClient(
    model="llama3.1:latest",
    host="121.212.1212:1212/",
    api_key="ollama",
)

messages = [
    UserMessage(content="hello", source="user"),
]

# print(asyncio.run(client.create(messages=messages)))
print(client)
print("DONE IT")
```
Result, without any error:
```
(python) (base) ➜  my-app python t.py
<autogen_ext.models.ollama._ollama_client.OllamaChatCompletionClient object at 0x107e7bb10>
DONE IT
```
```python
import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination
from autogen_core import CancellationToken
from autogen_ext.tools.mcp import StdioServerParams, mcp_server_tools
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.models.ollama import OllamaChatCompletionClient
from autogen_core.models import UserMessage

client = OllamaChatCompletionClient(
    model="llama3:8b",
    host="http://localhost:11435",
    api_key="ollama",
    model_info={
        "vision": False,
        "function_calling": True,
        "json_output": True,
        "family": "unknown",
    },
)

messages = [
    UserMessage(content="hello", source="user"),
]

print(asyncio.run(client.create(messages=messages)))
```
Hmm, why is the model name llama3:8b in the code, but 7lama3:8b in the error output? Please check it.
7lama3:8b was just a wrong name I typed during the first run; ignore it.
At present, running the test on its own without calling MCP works fine.
Cool, how about with MCP? I don't know much about MCP, though, so if the remaining issue is there, I may not be able to help you.
MCP still cannot be used. For this issue, it seems some models hit errors when calling MCP. Who should I consult, and what should I do?
```python
import asyncio
from autogen_agentchat.agents import UserProxyAgent
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.models.ollama import OllamaChatCompletionClient
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.agents.web_surfer import MultimodalWebSurfer

async def main() -> None:
    # model_client = OpenAIChatCompletionClient(model="gpt-4o")
    model_client = OllamaChatCompletionClient(
        model="qwen2.5:72b",
        host="http://localhost:11435",
        api_key="ollama",
        model_info={
            "vision": False,
            "function_calling": True,
            "json_output": True,
            "family": "unknown",
        },
    )
    # The web surfer will open a Chromium browser window to perform web browsing tasks.
    web_surfer = MultimodalWebSurfer("web_surfer", model_client, headless=False, animate_actions=True)
    # The user proxy agent is used to get user input after each step of the web surfer.
    # NOTE: you can skip input by pressing Enter.
    user_proxy = UserProxyAgent("user_proxy")
    # The termination condition is set to end the conversation when the user types 'exit'.
    termination = TextMentionTermination("exit", sources=["user_proxy"])
    # Web surfer and user proxy take turns in a round-robin fashion.
    team = RoundRobinGroupChat([web_surfer, user_proxy], termination_condition=termination)
    try:
        # Start the team and wait for it to terminate.
        await Console(team.run_stream(task="Find information about AutoGen and write a short summary."))
    finally:
        await web_surfer.close()
        await model_client.close()

asyncio.run(main())
```
Can you please help me check this issue? It doesn't work when we switch to Ollama, and I found that the parameter formats of OpenAIChatCompletionClient and OllamaChatCompletionClient are also different, which may be the reason for the MCP error.
```python
import asyncio
from autogen_core.models import UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.models.ollama import OllamaChatCompletionClient

client1 = OllamaChatCompletionClient(
    model="llama3.1:latest",
    host="127.0.0.1:11435",
    api_key="ollama",
)

client2 = OpenAIChatCompletionClient(
    model="llama3.1:latest",
    base_url="http://127.0.0.1:11435/v1",
    api_key="ollama",
    model_info={
        "vision": False,
        "function_calling": True,
        "json_output": True,
        "family": "unknown",
    },
)

messages = [
    UserMessage(content="hello", source="user"),
]

print("OLLAMA SDK : ", asyncio.run(client1.create(messages=messages)))
print("OPENAI SDK : ", asyncio.run(client2.create(messages=messages)))
print(client2)
print("DONE IT")
```
Here is a code snippet for the OpenAI-SDK-compatible way of reaching Ollama. If this solves your whole issue, please close it, since it is not a bug.
What I mainly want to ask is why calling MCP works with GPT-4o but fails with errors on the Ollama model. That still hasn't been resolved, so I feel there may still be a bug.
How about other function calls (tools)? Do they work without MCP? You could try something like the sketch below.
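A minimal sketch for that check, with a hypothetical get_weather function standing in as the tool (the model name and host here are just placeholders):

```python
import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.ollama import OllamaChatCompletionClient

async def get_weather(city: str) -> str:
    """Hypothetical tool: return canned weather for a city."""
    return f"The weather in {city} is sunny."

async def main() -> None:
    # llama3.1 is in the built-in model info table, so model_info can be omitted.
    model_client = OllamaChatCompletionClient(model="llama3.1:latest", host="http://127.0.0.1:11435")
    agent = AssistantAgent(name="assistant", model_client=model_client, tools=[get_weather])
    # If the model emits a tool call, the agent runs get_weather and returns its result.
    result = await agent.run(task="What is the weather in Paris?")
    print(result.messages[-1].content)

asyncio.run(main())
```

Separately, here is the full MCP version of the script: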
```python
import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination
from autogen_core import CancellationToken
from autogen_ext.tools.mcp import StdioServerParams, mcp_server_tools
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.models.ollama import OllamaChatCompletionClient
import os

async def main():
    # Set up the MCP fetch server parameters
    fetch_mcp_server = StdioServerParams(command="node", args=["./fetch-mcp-main/dist/index.js"])
    # Get the fetch tool from the MCP server
    tools = await mcp_server_tools(fetch_mcp_server)

    # model_client = OpenAIChatCompletionClient(model="gpt-4o")
    model_client = OllamaChatCompletionClient(
        model="llama3.1:latest",
        host="127.0.0.1:11435",
        model_info={  # non-OpenAI models need this configured
            "vision": False,
            "function_calling": True,
            "json_output": True,
            "family": "unknown",
            "structured_output": True,
        },
    )
    print(model_client)

    fetch_agent = AssistantAgent(
        name="content_fetcher",
        system_message="You are a web content fetching assistant. Use the fetch tool to get web page content.",
        model_client=model_client,
        tools=tools
    )

    # Create rewriter Agent (unchanged)
    rewriter_agent = AssistantAgent(
        name="content_rewriter",
        system_message="""You are a content rewriting expert. Rewrite the web content given to you as a tech-news style article.
Tech-news style traits:
1. Concise, eye-catching title
2. Opening that states the topic directly
3. Objective, accurate, yet lively content
4. Technical terms used but clearly explained
5. Short paragraphs with clear emphasis
When you have finished the rewrite, reply TERMINATE.""",
        model_client=model_client
    )

    # Set up termination condition and team (unchanged)
    termination = TextMentionTermination("TERMINATE")
    team = RoundRobinGroupChat([fetch_agent, rewriter_agent], termination_condition=termination)

    # Run the workflow (unchanged)
    result = await team.run(
        task="Fetch the content of https://www.aivi.fyi/llms/introduce-Claude-3.7-Sonnet, then rewrite it as a tech-news style article",
        cancellation_token=CancellationToken()
    )
    print("\nFinal rewritten result:\n")
    print(result.messages[-1].content)
    return result

# This is the correct way to run async code in a Python script
if __name__ == "__main__":
    asyncio.run(main())
```
It works on my PC, haha...
Hi, here is my attempt. First, I ran the code you gave me; it no longer hit the previous error and MCP initialized successfully. But then there was a prompt, followed by an error message saying the port request was being redirected. Since my Ollama is reached through server port forwarding, the redirect failed; for now it seems this setup is not supported overall. Next, I want to try some other models with this code to see why it did not work before.
However, when I ran the same code a second and third time, I got different results.
@xsw1006931693 Hi, I was wondering... could this issue possibly be related to #6198?
I couldn’t dig deeper into that one due to the lack of a reproducible code snippet, but while working on this issue, it came to mind so I wanted to leave a quick comment.
As far as I can tell, the issue in #6198 seems to be addressed in #6284, though it hasn’t been merged yet.
Would appreciate it if you could take a look when you get a chance. :)