
Ollama Client (with tool calling)

Open · marklysze opened this issue 1 year ago • 30 comments

An Ollama client! Run your local models with AutoGen using a dedicated client class.

One of the key features of this client (and one that is still very much experimental) is support for tool calling. This is done "manually" by injecting the tools into the prompt and translating between AutoGen's tool call objects and text messages (now updated to also support Ollama's native tool calling). Tool calling is described in more detail below, but essentially you should be able to get up and running with it as it stands, without customising the injected text.

This manual tool calling approach and the actual text injected are an initial attempt at handling tool calling, so if you can help improve it, please do!
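
To give a feel for the approach, here is a simplified sketch (not the exact prompt text or the functions in this PR; the names are illustrative): the registered tool schemas are rendered into plain prompt text, and the model's JSON reply is parsed back into an AutoGen-style tool call.

import json
from typing import Optional

def render_tools_prompt(tools: list) -> str:
    """Describe the available tools in plain text for the model."""
    lines = ['You can call a tool by replying with JSON of the form '
             '{"name": "<tool>", "arguments": {...}}. Available tools:']
    for tool in tools:
        fn = tool["function"]
        lines.append(f'- {fn["name"]}: {fn.get("description", "")} '
                     f'(parameters: {json.dumps(fn["parameters"])})')
    return "\n".join(lines)

def parse_tool_call(text: str) -> Optional[dict]:
    """Interpret the model's reply as a tool call, or return None if it isn't one."""
    try:
        # The actual client also repairs malformed JSON (hence the fix-busted-json dependency).
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    if isinstance(data, dict) and "name" in data:
        return {"name": data["name"], "arguments": json.dumps(data.get("arguments", {}))}
    return None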

I'll use the client with some notebooks and local models and summarise the results in another comment.

To run the code, you'll need to install the ollama and fix-busted-json packages (these will be installed automatically once this is merged and you install through pip install pyautogen[ollama]):

pip install ollama
pip install fix-busted-json

2024-07-27: Updated to include Ollama's native tool calling (just released in v0.3.0 of the Ollama library).

Related issue numbers

#2893

Checks

  • [X] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
  • [X] I've added tests (if relevant) corresponding to the changes introduced in this PR.
  • [X] I've made sure all auto checks have passed.

marklysze avatar Jul 01 '24 20:07 marklysze

Basic program:

# THIS TESTS: TWO AGENTS WITH TERMINATION

altmodel_llm_config = {
    "config_list":
    [
        {
            "api_type": "ollama",
            "model": "llama3:8b-instruct-q6_K",
            "client_host": "http://192.168.0.1:11434",
            "seed": 42
        }
    ]
}

from autogen import ConversableAgent

jack = ConversableAgent(
    "Jack",
    llm_config=altmodel_llm_config,
    system_message="Your name is Jack and you are a comedian in a two-person comedy show.",
    is_termination_msg=lambda x: "FINISH" in x["content"]
)
emma = ConversableAgent(
    "Emma",
    llm_config=altmodel_llm_config,
    system_message="Your name is Emma and you are a comedian in two-person comedy show. Say the word FINISH ONLY AFTER you've heard 2 of Jack's jokes.",
    is_termination_msg=lambda x: True if "FINISH" in x["content"] else False
)

chat_result = jack.initiate_chat(emma, message="Emma, tell me a joke about goldfish and peanut butter.", max_turns=10)

Add "stream": True to the config to use streaming.

marklysze avatar Jul 01 '24 20:07 marklysze

Tool calling:

import autogen
from typing import Literal
from typing_extensions import Annotated

# THIS TESTS: TOOL CALLING

altmodel_llm_config = {
    "config_list":
    [
        {
            "api_type": "ollama",
            "model": "llama3:8b-instruct-q6_K",
            "client_host": "http://192.168.0.1:11434",
            "seed": 43,
            "cache_seed": None
        }
    ]
}

# Create the agent and include examples of the function calling JSON in the prompt
# to help guide the model
chatbot = autogen.AssistantAgent(
    name="chatbot",
    system_message="For currency exchange tasks, "
        "only use the functions you have been provided with.",
    llm_config=altmodel_llm_config,
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    is_termination_msg=lambda x: x.get("content", "") and "TERMINATE" in x.get("content", ""),
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1,
)

CurrencySymbol = Literal["USD", "EUR"]

# Define our function that we expect to call
def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:
    if base_currency == quote_currency:
        return 1.0
    elif base_currency == "USD" and quote_currency == "EUR":
        return 1 / 1.1
    elif base_currency == "EUR" and quote_currency == "USD":
        return 1.1
    else:
        raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}")

# Register the function with the agent
@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Currency exchange calculator.")
def currency_calculator(
    base_amount: Annotated[float, "Amount of currency in base_currency"],
    base_currency: Annotated[CurrencySymbol, "Base currency"] = "USD",
    quote_currency: Annotated[CurrencySymbol, "Quote currency"] = "EUR",
) -> str:
    quote_amount = exchange_rate(base_currency, quote_currency) * base_amount
    return f"{format(quote_amount, '.2f')} {quote_currency}"

# start the conversation
res = user_proxy.initiate_chat(
    chatbot,
    message="How much is 123.45 EUR in USD?",
    summary_method="reflection_with_llm",
)

print(f"SUMMARY: {res.summary['content']}")

and result:

user_proxy (to chatbot):

How much is 123.45 EUR in USD?

--------------------------------------------------------------------------------
chatbot (to user_proxy):


***** Suggested tool call (ollama_func_3384): currency_calculator *****
Arguments: 
{"base_amount": 123.45, "base_currency": "EUR", "quote_currency": "USD"}
***********************************************************************

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING FUNCTION currency_calculator...
user_proxy (to chatbot):

user_proxy (to chatbot):

***** Response from calling tool (ollama_func_3384) *****
135.80 USD
*********************************************************

--------------------------------------------------------------------------------
chatbot (to user_proxy):

The result is 135.80 USD.

--------------------------------------------------------------------------------
SUMMARY: 123.45 EUR is equivalent to 135.80 USD.

marklysze avatar Jul 01 '24 20:07 marklysze

Parallel tool calling (LLM recommends multiple tool calls at a time):

import os
import autogen
import json
from typing import Literal
from typing_extensions import Annotated

# THIS TESTS: PARALLEL TOOL CALLING

altmodel_llm_config = {
    "config_list":
    [
        {
            "api_type": "ollama",
            "model": "llama3:8b-instruct-q6_K",
            "client_host": "http://192.168.0.1:11434",
            "seed": 43,
            "cache_seed": None,
            "hide_tools": "if_all_run"
        }
    ]
}

# Create the agent and include examples of the function calling JSON in the prompt
# to help guide the model
chatbot = autogen.AssistantAgent(
    name="chatbot",
    system_message="For currency exchange and weather forecasting tasks, "
        "only use the functions you have been provided with.",
    llm_config=altmodel_llm_config,
)


user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    is_termination_msg=lambda x: x.get("content", "") and "TERMINATE" in x.get("content", ""),
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1,
)

# Currency Exchange function

CurrencySymbol = Literal["USD", "EUR"]

# Define our function that we expect to call
def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:
    if base_currency == quote_currency:
        return 1.0
    elif base_currency == "USD" and quote_currency == "EUR":
        return 1 / 1.1
    elif base_currency == "EUR" and quote_currency == "USD":
        return 1.1
    else:
        raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}")

# Register the function with the agent
@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Currency exchange calculator.")
def currency_calculator(
    base_amount: Annotated[float, "Amount of currency in base_currency"],
    base_currency: Annotated[CurrencySymbol, "Base currency"] = "USD",
    quote_currency: Annotated[CurrencySymbol, "Quote currency"] = "EUR",
) -> str:
    quote_amount = exchange_rate(base_currency, quote_currency) * base_amount
    return f"{format(quote_amount, '.2f')} {quote_currency}"


# Weather function

# Example function to make available to model
def get_current_weather(location, unit="fahrenheit"):
    """Get the weather for some location"""
    if "chicago" in location.lower():
        return json.dumps({"location": "Chicago", "temperature": "13", "unit": unit})
    elif "san francisco" in location.lower():
        return json.dumps({"location": "San Francisco", "temperature": "55", "unit": unit})
    elif "new york" in location.lower():
        return json.dumps({"location": "New York", "temperature": "11", "unit": unit})
    else:
        return json.dumps({"location": location, "temperature": "unknown"})

# Register the function with the agent
@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Weather forecast for US cities.")
def weather_forecast(
    location: Annotated[str, "City name"],
) -> str:
    weather_details = get_current_weather(location=location)
    weather = json.loads(weather_details)
    return f"{weather['location']} will be {weather['temperature']} degrees {weather['unit']}"

# start the conversation
res = user_proxy.initiate_chat(
    chatbot,
    message="What's the weather in New York and can you tell me how much is 123.45 EUR in USD so I can spend it on my holiday?",
    summary_method="reflection_with_llm",
)

print(f"SUMMARY: {res.summary['content']}")

and result:

user_proxy (to chatbot):

What's the weather in New York and can you tell me how much is 123.45 EUR in USD so I can spend it on my holiday?

--------------------------------------------------------------------------------
chatbot (to user_proxy):


***** Suggested tool call (ollama_func_5948): weather_forecast *****
Arguments: 
{"location": "New York"}
********************************************************************
***** Suggested tool call (ollama_func_5949): currency_calculator *****
Arguments: 
{"base_amount": 123.45, "base_currency": "EUR", "quote_currency": "USD"}
***********************************************************************

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING FUNCTION weather_forecast...

>>>>>>>> EXECUTING FUNCTION currency_calculator...
user_proxy (to chatbot):

user_proxy (to chatbot):

***** Response from calling tool (ollama_func_5948) *****
New York will be 11 degrees fahrenheit
*********************************************************

--------------------------------------------------------------------------------
user_proxy (to chatbot):

***** Response from calling tool (ollama_func_5949) *****
135.80 USD
*********************************************************

--------------------------------------------------------------------------------
chatbot (to user_proxy):

It will be 11 degrees Fahrenheit in New York and $135.80 is the equivalent of €123.45 in USD, making it a suitable amount to spend on your holiday.

--------------------------------------------------------------------------------
SUMMARY: New York will be 11 degrees Fahrenheit. €123.45 is equivalent to $135.80 in USD.

marklysze avatar Jul 01 '24 20:07 marklysze

thanks for the contribution @marklysze , looks great!!

Hk669 avatar Jul 02 '24 05:07 Hk669

Codecov Report

Attention: Patch coverage is 4.94297% with 250 lines in your changes missing coverage. Please review.

Project coverage is 29.68%. Comparing base (38cce47) to head (00d2a58). Report is 7 commits behind head on main.

Files with missing lines            Patch %   Lines
autogen/oai/ollama.py               3.60%     241 Missing
autogen/oai/client.py               40.00%    5 Missing and 1 partial
autogen/logger/file_logger.py       0.00%     1 Missing
autogen/logger/sqlite_logger.py     0.00%     1 Missing
autogen/runtime_logging.py          0.00%     1 Missing
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3056      +/-   ##
==========================================
- Coverage   30.49%   29.68%   -0.82%     
==========================================
  Files         113      115       +2     
  Lines       12284    12725     +441     
  Branches     2602     2709     +107     
==========================================
+ Hits         3746     3777      +31     
- Misses       8210     8599     +389     
- Partials      328      349      +21     
Flag        Coverage Δ
unittests   29.68% <4.94%> (-0.80%) ↓

Flags with carried forward coverage won't be shown.

codecov-commenter avatar Jul 02 '24 09:07 codecov-commenter

As discussed on Discord with @marklysze, there are some benefits to separating the tool calling from the client itself: if we have multiple clients whose APIs lack tool calling, the same code would work for all of them, and it avoids duplicating code and having to update it in multiple places, so it makes more sense to use a Capability. Similarly for message cleanup (i.e. adjusting the messages given to an agent): the variety of tweaks needed depends greatly on both the client and the model, and is often related to the model's prompt template (which is beyond the scope of this discussion, except to note that it's fairly invisible to most devs until it errors), so putting that code into a Capability also makes more sense.

I went down this road and realized the above was a better answer; kudos to Mark for his work overall. Let's adjust this so that nobody writing a client in the future needs to rewrite these pieces, and devs can add a few lines of code when they get an error or need tool calling for a particular model or client.
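
As a very rough sketch of the idea (not code from this PR; the import path is an assumption based on existing Capabilities such as Teachability, and the class and variable names are illustrative), a tool-calling Capability could inject tool descriptions into any agent, regardless of which client class it uses:

from autogen import ConversableAgent
# Assumed import path, mirroring existing capabilities such as Teachability.
from autogen.agentchat.contrib.capabilities.agent_capability import AgentCapability


class ManualToolCalling(AgentCapability):
    """Illustrative only: prompt-injected tool calling as a Capability, so clients
    without a native tools API don't each need their own implementation."""

    def add_to_agent(self, agent: ConversableAgent) -> None:
        # Tools registered via register_for_llm end up in the agent's llm_config.
        tools = (agent.llm_config or {}).get("tools", [])
        if not tools:
            return
        descriptions = "\n".join(
            f'- {t["function"]["name"]}: {t["function"].get("description", "")}' for t in tools
        )
        # Describe the registered tools in the system message so any client/model
        # combination sees them, independent of the client class.
        agent.update_system_message(
            agent.system_message
            + '\n\nYou can call these tools by replying with JSON of the form '
            '{"name": "<tool>", "arguments": {...}}:\n'
            + descriptions
        )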

I'll try to get my code organized (I planned on it last week, but was dealing with some illness) and get it posted as a PR, and we can compare and see what makes the most sense to move forward with.

scruffynerf avatar Jul 02 '24 13:07 scruffynerf

Ollama client in C# example needed

GeorgeS2019 avatar Jul 02 '24 23:07 GeorgeS2019

Ollama client in C# example needed

@GeorgeS2019, is that something you would be able to assist with?

marklysze avatar Jul 02 '24 23:07 marklysze

May I suggest improving this Ollama client with the Testcontainers Ollama module? It's quite nice when everything is containerized, and you can do everything in code, which is also a nice thing.

If this doesn't fit, I'll take notes on how you do it and maybe try my hand at providing this as an alternative.

Josephrp avatar Jul 04 '24 18:07 Josephrp

May I suggest improving this Ollama client with the Testcontainers Ollama module? It's quite nice when everything is containerized, and you can do everything in code, which is also a nice thing.

If this doesn't fit, I'll take notes on how you do it and maybe try my hand at providing this as an alternative.

Hi @Josephrp, sounds like it's worth looking in to. Is there a reference to what you're thinking? I'm not sure what that involves.

marklysze avatar Jul 04 '24 20:07 marklysze

https://testcontainers.com/modules/ollama/

This is the object.

This is the pull request for Python: https://github.com/testcontainers/testcontainers-python/pull/618/files

But in general I really like this "ephemeral" way of making containers for this kind of purpose.

Josephrp avatar Jul 05 '24 06:07 Josephrp

Hey there, hope I didn't take the wind out of your sails; I'm keen to look into this together if you want :-)

Josephrp avatar Jul 09 '24 23:07 Josephrp

Hey there, hope I didn't take the wind out of your sails; I'm keen to look into this together if you want :-)

Hey @Josephrp, sorry I'm currently out with the flu :(

I think it's a good topic to investigate. My focus is on having local LLMs available within AutoGen, so don't worry about taking the wind out of my sails :).

Perhaps a proof of concept that others can provide feedback on would be a good start. Not sure how tricky that is, but it would be good for others to weigh in, and also good to know whether it's Ollama-specific or could be used for other client classes.

Sorry I haven't provided feedback on it.

marklysze avatar Jul 10 '24 02:07 marklysze

hey get better soon okay :-)

Josephrp avatar Jul 10 '24 12:07 Josephrp

When you get ready to publish samples, I recommend using "http://localhost:11434" or "http://127.0.0.1:11434" instead of "http://192.168.0.1:11434".

smartdawg avatar Jul 15 '24 21:07 smartdawg

Hi @marklysze! I tried your second example (tool calling with currency exchange). Here is my output. It seems that it can call the functions, but there is no "chatbot (to user_proxy): The result is 135.80 USD." And there is a warning message. May I kindly ask how to fix it? Thanks a lot.

Output

user_proxy (to chatbot):

How much is 123.45 EUR in USD?

[autogen.oai.client: 07-20 15:50:54] {329} WARNING - Model ollama/llama3 is not found. The cost will be 0. In your config_list, add field {"price" : [prompt_price_per_1k, completion_token_price_per_1k]} for customized pricing.
chatbot (to user_proxy):

***** Suggested tool call (call_2e7ece9b-6fb2-4871-b503-5f91bfefd58e): currency_calculator *****
Arguments: 
{"base_amount": 123.45, "base_currency": "EUR", "quote_currency": "USD"}
************************************************************************************************

Provide feedback to chatbot. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: 

>>>>>>>> NO HUMAN INPUT RECEIVED.

>>>>>>>> USING AUTO REPLY...

>>>>>>>> EXECUTING FUNCTION currency_calculator...
user_proxy (to chatbot):

user_proxy (to chatbot):

***** Response from calling tool (call_2e7ece9b-6fb2-4871-b503-5f91bfefd58e) *****
135.80 USD
**********************************************************************************

[autogen.oai.client: 07-20 15:51:01] {329} WARNING - Model ollama/llama3 is not found. The cost will be 0. In your config_list, add field {"price" : [prompt_price_per_1k, completion_token_price_per_1k]} for customized pricing.
chatbot (to user_proxy):

***** Suggested tool call (call_0c00a35f-9be7-41da-a57f-b5599049ab3b): currency_calculator *****
Arguments: 
{"base_amount": 135.8, "base_currency": "USD", "quote_currency": "EUR"}
************************************************************************************************

--------------------------------------------------------------------------------
Provide feedback to chatbot. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: print(f"SUMMARY: {res.summary['content']}")
user_proxy (to chatbot):

Here is my code:


import autogen
from typing import Literal
from typing_extensions import Annotated



config_list = [
    {
        "api_type": "ollama",
        "model": "llama3",
        "base_url": "http://127.0.0.1:4000",
        "seed": 43,
        "cache_seed": None
    }
]

llm_config = {"config_list": config_list}


chatbot = autogen.AssistantAgent(
    name="chatbot",
    system_message="For currency exchange tasks, "
        "only use the functions you have been provided with.",
    llm_config=llm_config,
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    is_termination_msg=lambda x: x.get("content", "") and "TERMINATE" in x.get("content", ""),
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1,
)

CurrencySymbol = Literal["USD", "EUR"]


def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:
    if base_currency == quote_currency:
        return 1.0
    elif base_currency == "USD" and quote_currency == "EUR":
        return 1 / 1.1
    elif base_currency == "EUR" and quote_currency == "USD":
        return 1.1
    else:
        raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}")


@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Currency exchange calculator.")
def currency_calculator(
    base_amount: Annotated[float, "Amount of currency in base_currency"],
    base_currency: Annotated[CurrencySymbol, "Base currency"] = "USD",
    quote_currency: Annotated[CurrencySymbol, "Quote currency"] = "EUR",
) -> str:
    quote_amount = exchange_rate(base_currency, quote_currency) * float(base_amount)
    return f"{format(quote_amount, '.2f')} {quote_currency}"


res = user_proxy.initiate_chat(
    chatbot,
    message="How much is 123.45 EUR in USD?",
    summary_method="reflection_with_llm",
)

print(f"SUMMARY: {res.summary['content']}")

yilinwu123 avatar Jul 20 '24 07:07 yilinwu123

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request
GitGuardian id: 10404695
GitGuardian status: Triggered
Secret: Generic High Entropy Secret
Commit: 49f4c192262a38fda23f8e94362e26e4175434af
Filename: test/oai/test_utils.py
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn here the best practices.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.


gitguardian[bot] avatar Jul 20 '24 21:07 gitguardian[bot]

Trying to run the AutoGen deeplearning.ai online course with Ollama through this branch.

Ran into some issues with lesson 2, Sequential Chats and Customer Onboarding, which I believe indicates an issue with the Ollama client in its current form.

Stacktrace on error:

Traceback (most recent call last):
  File "/workspaces/multi-agent-autogen-experiments/lesson2_local.py", line 106, in <module>
    chat_results = initiate_chats(chats)
                   ^^^^^^^^^^^^^^^^^^^^^
  File "/home/autogen/autogen/autogen/agentchat/chat.py", line 199, in initiate_chats
    __post_carryover_processing(chat_info)
  File "/home/autogen/autogen/autogen/agentchat/chat.py", line 119, in __post_carryover_processing
    ("\n").join([t for t in chat_info["carryover"]])
TypeError: sequence item 0: expected str instance, dict found

https://github.com/microsoft/autogen/blob/ollamaclient/autogen/agentchat/chat.py#L199 is where the issue occurs

It's unclear to me exactly what is causing this, but the same code works if the llm_config points to an OpenAI model.
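
For what it's worth, that join only accepts strings, so the error reproduces in isolation whenever a carryover item is a message dict rather than a plain string (a minimal illustration, with made-up content):

# str.join() only accepts strings, so a dict anywhere in the carryover list raises this error.
carryover = [{"content": "a summary returned as a message dict, not a plain string"}]
"\n".join([t for t in carryover])  # TypeError: sequence item 0: expected str instance, dict found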

Code for reproducing

import pprint
import os

#Load .env with OPENAI_API_KEY
from dotenv import load_dotenv
load_dotenv()

llm_config = {
    "config_list":
    [
        {
            "api_type": "ollama",
            "model": "llama3:8b",
            "client_host": "http://host.docker.internal:11434",
            "seed": 42,
            "price": [0.0,0.0]
        }
    ]
}

#The example below works with openai llm_config
#llm_config = {"model": "gpt-4o-mini", "api_key": os.environ["OPENAI_API_KEY"]}


from autogen import ConversableAgent

onboarding_personal_information_agent = ConversableAgent(
    name="Onboarding Personal Information Agent",
    system_message='''You are a helpful customer onboarding agent,
    you are here to help new customers get started with our product.
    Your job is to gather customer's name and location.
    Do not ask for other information. Return 'TERMINATE' 
    when you have gathered all the information.''',
    llm_config=llm_config,
    code_execution_config=False,
    human_input_mode="NEVER",
)

onboarding_topic_preference_agent = ConversableAgent(
    name="Onboarding Topic preference Agent",
    system_message='''You are a helpful customer onboarding agent,
    you are here to help new customers get started with our product.
    Your job is to gather customer's preferences on news topics.
    Do not ask for other information.
    Return 'TERMINATE' when you have gathered all the information.''',
    llm_config=llm_config,
    code_execution_config=False,
    human_input_mode="NEVER",
)

customer_engagement_agent = ConversableAgent(
    name="Customer Engagement Agent",
    system_message='''You are a helpful customer service agent
    here to provide fun for the customer based on the user's
    personal information and topic preferences.
    This could include fun facts, jokes, or interesting stories.
    Make sure to make it engaging and fun!
    Return 'TERMINATE' when you are done.''',
    llm_config=llm_config,
    code_execution_config=False,
    human_input_mode="NEVER",
    is_termination_msg=lambda msg: "terminate" in msg.get("content").lower(),
)

customer_proxy_agent = ConversableAgent(
    name="customer_proxy_agent",
    llm_config=False,
    code_execution_config=False,
    human_input_mode="ALWAYS",
    is_termination_msg=lambda msg: "terminate" in msg.get("content").lower(),
)

chats = [
    {
        "sender": onboarding_personal_information_agent,
        "recipient": customer_proxy_agent,
        "message": 
            "Hello, I'm here to help you get started with our product."
            "Could you tell me your name and location?",
        "summary_method": "reflection_with_llm",
        "summary_args": {
            "summary_prompt" : "Return the customer information "
                             "into as JSON object only: "
                             "{'name': '', 'location': ''}",
        },
        "max_turns": 2,
        "clear_history" : True
    },
    {
        "sender": onboarding_topic_preference_agent,
        "recipient": customer_proxy_agent,
        "message": 
                "Great! Could you tell me what topics you are "
                "interested in reading about?",
        "summary_method": "reflection_with_llm",
        "max_turns": 1,
        "clear_history" : False
    },
    {
        "sender": customer_proxy_agent,
        "recipient": customer_engagement_agent,
        "message": "Let's find something fun to read.",
        "max_turns": 1,
        "summary_method": "reflection_with_llm",
    },
]

from autogen import initiate_chats

chat_results = initiate_chats(chats)

for chat_result in chat_results:
    print(chat_result.summary)
    print("\n")

elsewhat avatar Jul 22 '24 08:07 elsewhat

Hi @marklysze! I tried your second example (tool calling with currency exchange). Here is my output. It seems that it can call the functions, but there is no "chatbot (to user_proxy): The result is 135.80 USD." And there is a warning message. May I kindly ask how to fix it? Thanks a lot.

Hi @yilinwu123, thanks for testing this out...

I ran your code but with my local config:

config_list = [
    {
        "api_type": "ollama",
        "model": "llama3:instruct",
        "client_host": "http://192.168.0.115:11434",
        "seed": 43,
        "cache_seed": None
    }
]

Please note that the Ollama client class uses client_host (not base_url). I need to include :instruct in my model name, but the default is the instruct variant, so you may not need to.

When I ran it I did get the summary showing. Can you try with client_host?

marklysze avatar Jul 25 '24 22:07 marklysze

Trying to run the AutoGen deeplearning.ai online course with Ollama through this branch.

Ran into some issues with lesson 2, Sequential Chats and Customer Onboarding, which I believe indicates an issue with the Ollama client in its current form.

Stacktrace on error:

Traceback (most recent call last):
  File "/workspaces/multi-agent-autogen-experiments/lesson2_local.py", line 106, in <module>
    chat_results = initiate_chats(chats)
                   ^^^^^^^^^^^^^^^^^^^^^
  File "/home/autogen/autogen/autogen/agentchat/chat.py", line 199, in initiate_chats
    __post_carryover_processing(chat_info)
  File "/home/autogen/autogen/autogen/agentchat/chat.py", line 119, in __post_carryover_processing
    ("\n").join([t for t in chat_info["carryover"]])
TypeError: sequence item 0: expected str instance, dict found

https://github.com/microsoft/autogen/blob/ollamaclient/autogen/agentchat/chat.py#L199 is where the issue occurs

It's unclear to me exactly what is causing this, but the same code works if the llm_config points to an OpenAI model.

Hi @elsewhat, good to get this code tested... thanks for trying it out against the course material.

I just tried with my config:

llm_config = {
    "config_list":
    [
        {
            "api_type": "ollama",
            "model": "llama3:instruct",
            "client_host": "http://192.168.0.115:11434",
            "seed": 42,
            "price": [0.0,0.0]
        }
    ]
}

And I'm getting through the code:

********************************************************************************
Starting a new chat....

********************************************************************************
Onboarding Personal Information Agent (to customer_proxy_agent):

Hello, I'm here to help you get started with our product.Could you tell me your name and location?

--------------------------------------------------------------------------------
Provide feedback to Onboarding Personal Information Agent. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: Mark and I'm in New York
customer_proxy_agent (to Onboarding Personal Information Agent):

Mark and I'm in New York

--------------------------------------------------------------------------------
Onboarding Personal Information Agent (to customer_proxy_agent):

So, just to confirm: your name is Mark and you're located in New York?

Please let me know if that's correct before I proceed.

(Note: waiting for confirmation)

--------------------------------------------------------------------------------
Provide feedback to Onboarding Personal Information Agent. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: Yes
customer_proxy_agent (to Onboarding Personal Information Agent):

Yes

--------------------------------------------------------------------------------

********************************************************************************
Starting a new chat....

********************************************************************************
Onboarding Topic preference Agent (to customer_proxy_agent):

Great! Could you tell me what topics you are interested in reading about?
Context: 
{'name': 'Mark', 'location': 'New York'}

--------------------------------------------------------------------------------
Provide feedback to Onboarding Topic preference Agent. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: Architecture
customer_proxy_agent (to Onboarding Topic preference Agent):

Architecture

--------------------------------------------------------------------------------

********************************************************************************
Starting a new chat....

********************************************************************************
customer_proxy_agent (to Customer Engagement Agent):

Let's find something fun to read.
Context: 
{'name': 'Mark', 'location': 'New York'}
I'm interested in reading about architecture.

--------------------------------------------------------------------------------
Customer Engagement Agent (to customer_proxy_agent):

Hi Mark! It's great to chat with you about architecture!

As a New Yorker, I'm sure you appreciate the iconic buildings and structures that make up the city's skyline. Did you know that the Empire State Building was the tallest building in the world when it was completed in 1931? It held that title for over 40 years! 

But let's talk about some fun facts about architecture. Did you know that:

* The Guggenheim Museum in New York City has a unique spiral design, which allows visitors to see art from multiple angles?
* The Flatiron Building in Manhattan is one of the most iconic and recognizable buildings in the world? 
* The Brooklyn Bridge, also in NYC, was the longest suspension bridge in the world when it opened in 1883?

Now, let me share a cool story about architecture. Have you heard about the High Line in Chelsea, Manhattan? It's an elevated park built on an old rail line. When the trains stopped running, the city decided to transform the area into a public green space. Today, it's one of the most popular parks in New York City!

I hope this sparks your interest in architecture, Mark! What do you think about these fascinating facts and stories? Would you like to learn more?

(When I'm done, I'll say "TERMINATE"!)

--------------------------------------------------------------------------------
{'content': "{'name': 'Mark', 'location': 'New York'}", 'role': 'assistant', 'function_call': None, 'tool_calls': None}


{'content': "I'm interested in reading about architecture.", 'role': 'assistant', 'function_call': None, 'tool_calls': None}


{'content': "New York-based architecture facts include the Empire State Building's former tallest-in-the-world status, the Guggenheim Museum's unique spiral design, and iconic buildings like the Flatiron Building and Brooklyn Bridge. The High Line in Chelsea is also mentioned as a popular elevated park built on an old rail line.", 'role': 'assistant', 'function_call': None, 'tool_calls': None}

How far through do you get before the crash?

marklysze avatar Jul 25 '24 22:07 marklysze

Ollama has just released an updated library, version 0.3.0, that includes support for tool calling! I'll test it out and see whether it's better than what's in place so far.

marklysze avatar Jul 26 '24 01:07 marklysze

Hi @marklysze! I tried your second example (tool calling with currency exchange). Here is my output. It seems that it can call the functions, but there is no "chatbot (to user_proxy): The result is 135.80 USD." And there is a warning message. May I kindly ask how to fix it? Thanks a lot.

Output

user_proxy (to chatbot):

How much is 123.45 EUR in USD?

[autogen.oai.client: 07-20 15:50:54] {329} WARNING - Model ollama/llama3 is not found. The cost will be 0. In your config_list, add field {"price" : [prompt_price_per_1k, completion_token_price_per_1k]} for customized pricing.
chatbot (to user_proxy):

***** Suggested tool call (call_2e7ece9b-6fb2-4871-b503-5f91bfefd58e): currency_calculator *****
Arguments: 
{"base_amount": 123.45, "base_currency": "EUR", "quote_currency": "USD"}
************************************************************************************************

Provide feedback to chatbot. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: 

>>>>>>>> NO HUMAN INPUT RECEIVED.

>>>>>>>> USING AUTO REPLY...

>>>>>>>> EXECUTING FUNCTION currency_calculator...
user_proxy (to chatbot):

user_proxy (to chatbot):

***** Response from calling tool (call_2e7ece9b-6fb2-4871-b503-5f91bfefd58e) *****
135.80 USD
**********************************************************************************

[autogen.oai.client: 07-20 15:51:01] {329} WARNING - Model ollama/llama3 is not found. The cost will be 0. In your config_list, add field {"price" : [prompt_price_per_1k, completion_token_price_per_1k]} for customized pricing.
chatbot (to user_proxy):

***** Suggested tool call (call_0c00a35f-9be7-41da-a57f-b5599049ab3b): currency_calculator *****
Arguments: 
{"base_amount": 135.8, "base_currency": "USD", "quote_currency": "EUR"}
************************************************************************************************

--------------------------------------------------------------------------------
Provide feedback to chatbot. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: print(f"SUMMARY: {res.summary['content']}")
user_proxy (to chatbot):

Here is my code:

import autogen
from typing import Literal
from typing_extensions import Annotated



config_list = [
    {
        "api_type": "ollama",
        "model": "llama3",
        "base_url": "http://127.0.0.1:4000",
        "seed": 43,
        "cache_seed": None
    }
]

llm_config = {"config_list": config_list}


chatbot = autogen.AssistantAgent(
    name="chatbot",
    system_message="For currency exchange tasks, "
        "only use the functions you have been provided with.",
    llm_config=llm_config,
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    is_termination_msg=lambda x: x.get("content", "") and "TERMINATE" in x.get("content", ""),
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1,
)

CurrencySymbol = Literal["USD", "EUR"]


def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:
    if base_currency == quote_currency:
        return 1.0
    elif base_currency == "USD" and quote_currency == "EUR":
        return 1 / 1.1
    elif base_currency == "EUR" and quote_currency == "USD":
        return 1.1
    else:
        raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}")


@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Currency exchange calculator.")
def currency_calculator(
    base_amount: Annotated[float, "Amount of currency in base_currency"],
    base_currency: Annotated[CurrencySymbol, "Base currency"] = "USD",
    quote_currency: Annotated[CurrencySymbol, "Quote currency"] = "EUR",
) -> str:
    quote_amount = exchange_rate(base_currency, quote_currency) * float(base_amount)
    return f"{format(quote_amount, '.2f')} {quote_currency}"


res = user_proxy.initiate_chat(
    chatbot,
    message="How much is 123.45 EUR in USD?",
    summary_method="reflection_with_llm",
)

print(f"SUMMARY: {res.summary['content']}")

Hi @marklysze! Thanks a lot for your reply. I changed llm_config (two versions with two URLs) as below, but there is a new error about the API key. May I kindly check: do I need to use litellm --model ollama/llama3 when I set "client_host"? The URL for LiteLLM is "http://127.0.0.1:4000", and I checked Ollama, whose local host URL is "http://localhost:11434". Thanks a lot for your support.

config_list = [
    {
        "api_type": "ollama",
        "model": "llama3",
        "client_host": "http://127.0.0.1:4000",
        "seed": 43,
        "cache_seed": None
    }
]
config_list = [
    {
        "api_type": "ollama",
        "model": "llama3",
        "client_host": "http://localhost:11434",
        "seed": 43,
        "cache_seed": None
    }
]

The error is

openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable

yilinwu123 avatar Jul 26 '24 05:07 yilinwu123

Hey all, I've updated the code to support Ollama's native tool calling. I've done some testing with Llama 3.1 8B (you'll need to ollama pull the models again if you downloaded them before, so that tool calling is incorporated) and it works okay, in that it will use the normal tools messaging format. Unfortunately, it's not perfect, and I noticed it will also fall into the continuous tool calling cycle (it keeps recommending a tool call even when the tools have already been run).

If you have already been running this code, you'll need to update your Ollama package to 0.3.0 (pip install -U ollama).

I have left the manual tool calling in because it allows you to run tool calling with any Ollama model, and it has run fairly well in my testing so far.

A few parameters to consider now for your llm config:
  • native_tool_calls: True (default) or False (uses manual tool calling)
  • hide_tools: 'never', 'if_all_run', or 'if_any_run'

hide_tools is useful for hiding tools once they have been run; this helps stop the LLM from recommending tools over and over.
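
For example, a config using both options might look like this (the model tag and host are just placeholders for whatever you run locally):

altmodel_llm_config = {
    "config_list":
    [
        {
            "api_type": "ollama",
            "model": "llama3.1:8b",
            "client_host": "http://localhost:11434",
            "native_tool_calls": False,  # use the manual (prompt-injected) tool calling
            "hide_tools": "if_all_run"   # hide the tools from the LLM once they have all been run
        }
    ]
}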

I've updated the documentation page as well with this detail, so see the local-ollama.ipynb file in the PR to find out how to use them.

marklysze avatar Jul 27 '24 02:07 marklysze

Hi @marklysze! Thanks a lot for your reply. I changed llm_config (two versions with two URLs) as below, but there is a new error about the API key. May I kindly check: do I need to use litellm --model ollama/llama3 when I set "client_host"? The URL for LiteLLM is "http://127.0.0.1:4000", and I checked Ollama, whose local host URL is "http://localhost:11434". Thanks a lot for your support.

config_list = [
    {
        "api_type": "ollama",
        "model": "llama3",
        "client_host": "http://127.0.0.1:4000",
        "seed": 43,
        "cache_seed": None
    }
]
config_list = [
    {
        "api_type": "ollama",
        "model": "llama3",
        "client_host": "http://localhost:11434",
        "seed": 43,
        "cache_seed": None
    }
]

The error is

openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable

Hey @yilinwu123, hmmmm, the api_key shouldn't be required. Are you able to show how you're running it by providing a bit more code? And just checking: are you using the code from this branch?

marklysze avatar Jul 27 '24 02:07 marklysze

Hi @marklysze! Thanks a lot for your reply. I changed llm_config (two versions with two URLs) as below, but there is a new error about the API key. May I kindly check: do I need to use litellm --model ollama/llama3 when I set "client_host"? The URL for LiteLLM is "http://127.0.0.1:4000", and I checked Ollama, whose local host URL is "http://localhost:11434". Thanks a lot for your support.

config_list = [
    {
        "api_type": "ollama",
        "model": "llama3",
        "client_host": "http://127.0.0.1:4000",
        "seed": 43,
        "cache_seed": None
    }
]
config_list = [
    {
        "api_type": "ollama",
        "model": "llama3",
        "client_host": "http://localhost:11434",
        "seed": 43,
        "cache_seed": None
    }
]

The error is

openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable

Hey @yilinwu123, hmmmm, the api_key shouldn't be required. Are you able to show how you're running it by providing a bit more code? And just checking: are you using the code from this branch?

Hi @marklysze! Thanks a lot for your help. I first created a conda environment in the terminal and made sure Ollama is running on my computer. Then I ran litellm --model ollama/llama3 and got the URL "http://127.0.0.1:4000". Then I opened Visual Studio and ran the code in this conda environment. The code is below (I also tried setting client_host to "http://localhost:11434"):

import autogen
from typing import Literal
from typing_extensions import Annotated



config_list = [
    {
        "api_type": "ollama",
        "model": "llama3",
        "client_host": "http://127.0.0.1:4000",
        "seed": 43,
        "cache_seed": None
    }
]

llm_config = {"config_list": config_list}


chatbot = autogen.AssistantAgent(
    name="chatbot",
    system_message="For currency exchange tasks, "
        "only use the functions you have been provided with.",
    llm_config=llm_config,
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    is_termination_msg=lambda x: x.get("content", "") and "TERMINATE" in x.get("content", ""),
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1,
)

CurrencySymbol = Literal["USD", "EUR"]


def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:
    if base_currency == quote_currency:
        return 1.0
    elif base_currency == "USD" and quote_currency == "EUR":
        return 1 / 1.1
    elif base_currency == "EUR" and quote_currency == "USD":
        return 1.1
    else:
        raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}")


@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Currency exchange calculator.")
def currency_calculator(
    base_amount: Annotated[float, "Amount of currency in base_currency"],
    base_currency: Annotated[CurrencySymbol, "Base currency"] = "USD",
    quote_currency: Annotated[CurrencySymbol, "Quote currency"] = "EUR",
) -> str:
    quote_amount = exchange_rate(base_currency, quote_currency) * float(base_amount)
    return f"{format(quote_amount, '.2f')} {quote_currency}"


res = user_proxy.initiate_chat(
    chatbot,
    message="How much is 123.45 EUR in USD?",
    summary_method="reflection_with_llm",
)

print(f"SUMMARY: {res.summary['content']}")

yilinwu123 avatar Jul 28 '24 06:07 yilinwu123

Hi @marklysze! Thanks a lot for your help. I first created a conda environment in the terminal and made sure Ollama is running on my computer. Then I ran litellm --model ollama/llama3 and got the URL "http://127.0.0.1:4000". Then I opened Visual Studio and ran the code in this conda environment. The code is below (I also tried setting client_host to "http://localhost:11434"):

Ah @yilinwu123, I see... it appears you're running LiteLLM (with Ollama support) rather than Ollama directly.

To get this to work with this branch, you need to be running Ollama directly, without LiteLLM. The client_host should be the Ollama URL (not the LiteLLM one).

To make sure you have the right URL, don't run LiteLLM and try something like this in your terminal or command prompt:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?"
}'

... this should run Ollama inference. Change the URL if it's not working.

If you can't find the right URL, make sure you've installed Ollama correctly: https://ollama.com/download

marklysze avatar Jul 28 '24 20:07 marklysze

curl http://localhost:11434/api/generate -d '{ "model": "llama3", "prompt": "Why is the sky blue?" }'

Hi @marklysze! I tried running Ollama without LiteLLM. I copied the URL "http://localhost:11434" into the browser and it shows 'Ollama is running'. I followed the steps and ran the provided command in the terminal (it seems to work; the response is "model":"llama3","created_at":"2024-07-29T13:16:12.784813Z","response":"","done":true,"done_reason":"stop","context"). But there is still an API error when I run the code. May I kindly check how to get the URL of Ollama on my local computer? Thanks a lot for your help. Really appreciate it.

yilinwu123 avatar Jul 29 '24 13:07 yilinwu123

Hi @marklysze! I tried running Ollama without LiteLLM. I copied the URL "http://localhost:11434" into the browser and it shows 'Ollama is running'. I followed the steps and ran the provided command in the terminal (it seems to work; the response is "model":"llama3","created_at":"2024-07-29T13:16:12.784813Z","response":"","done":true,"done_reason":"stop","context"). But there is still an API error when I run the code. May I kindly check how to get the URL of Ollama on my local computer? Thanks a lot for your help. Really appreciate it.

Hey @yilinwu123, this might help you with understanding Ollama's REST API: here is the link.

Hk669 avatar Jul 29 '24 14:07 Hk669

Hi @marklysze! I tried running Ollama without LiteLLM. I copied the URL "http://localhost:11434" into the browser and it shows 'Ollama is running'. I followed the steps and ran the provided command in the terminal (it seems to work; the response is "model":"llama3","created_at":"2024-07-29T13:16:12.784813Z","response":"","done":true,"done_reason":"stop","context"). But there is still an API error when I run the code. May I kindly check how to get the URL of Ollama on my local computer? Thanks a lot for your help. Really appreciate it.

Hey @yilinwu123, this might help you with understanding Ollama's REST API: here is the link.

@yilinwu123, if you're able to hop onto the AutoGen Discord (https://aka.ms/autogen-dc), feel free to message me (username msze) and I'll try to help you there.

marklysze avatar Jul 29 '24 19:07 marklysze

Okay, I think if we can review this and get it out and available for developers to use and provide feedback on, that would be great.

marklysze avatar Aug 07 '24 00:08 marklysze