Ollama Client (with tool calling)
An Ollama client! Run your local models with AutoGen using a dedicated client class.
One of the key features of this client (still very much experimental) is support for tool calling. This is done "manually" by injecting the tool definitions into the prompt and translating between AutoGen's tool call objects and text messages (since updated to also support Ollama's native tool calling). Tool calling is described in further detail below, but essentially you should be able to get up and running with it as it stands, without customising the injected text.
The manual tool calling approach and the actual text injected are an initial attempt at handling tool calling, so if you can help improve it, please do!
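To illustrate the general idea only (this is not the exact text the client injects; the tool description and JSON format below are purely hypothetical), a minimal sketch of prompt-based tool calling looks something like this:

import json

# Hypothetical sketch only - not the actual text this client injects.
# 1. Describe the registered tools as JSON and append them to the system prompt.
tools = [
    {
        "name": "currency_calculator",
        "description": "Currency exchange calculator.",
        "parameters": {"base_amount": "float", "base_currency": "str", "quote_currency": "str"},
    }
]
system_prompt_suffix = (
    "You can call these tools by replying with JSON of the form "
    '{"name": "<tool name>", "arguments": {...}}:\n' + json.dumps(tools, indent=2)
)

# 2. Parse a tool call out of the model's plain-text reply and translate it into
#    AutoGen's tool_calls message shape; fall back to a normal text message otherwise.
def parse_tool_call(response_text: str) -> dict:
    try:
        call = json.loads(response_text)
        return {
            "tool_calls": [
                {
                    "id": "manual_call_1",
                    "type": "function",
                    "function": {"name": call["name"], "arguments": json.dumps(call["arguments"])},
                }
            ]
        }
    except (json.JSONDecodeError, KeyError, TypeError):
        return {"content": response_text}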
I'll use the client with some notebooks and local models and summarise the results in another comment.
To run the code, you'll need to install the ollama and fix-busted-json packages (once this is merged, they will be installed automatically when you install through pip install pyautogen[ollama]):
pip install ollama
pip install fix-busted-json
2024-07-27: Updated to include Ollama's native tool calling (just released in v0.3.0 of the Ollama library).
Related issue numbers
#2893
Checks
- [X] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
- [X] I've added tests (if relevant) corresponding to the changes introduced in this PR.
- [X] I've made sure all auto checks have passed.
Basic program:
# THIS TESTS: TWO AGENTS WITH TERMINATION

altmodel_llm_config = {
    "config_list": [
        {
            "api_type": "ollama",
            "model": "llama3:8b-instruct-q6_K",
            "client_host": "http://192.168.0.1:11434",
            "seed": 42,
        }
    ]
}

from autogen import ConversableAgent

jack = ConversableAgent(
    "Jack",
    llm_config=altmodel_llm_config,
    system_message="Your name is Jack and you are a comedian in a two-person comedy show.",
    is_termination_msg=lambda x: True if "FINISH" in x["content"] else False,
)

emma = ConversableAgent(
    "Emma",
    llm_config=altmodel_llm_config,
    system_message="Your name is Emma and you are a comedian in a two-person comedy show. Say the word FINISH ONLY AFTER you've heard 2 of Jack's jokes.",
    is_termination_msg=lambda x: True if "FINISH" in x["content"] else False,
)

chat_result = jack.initiate_chat(emma, message="Emma, tell me a joke about goldfish and peanut butter.", max_turns=10)
Add "stream": True to the config to use streaming.
Tool calling:
import autogen
from typing import Literal
from typing_extensions import Annotated

# THIS TESTS: TOOL CALLING

altmodel_llm_config = {
    "config_list": [
        {
            "api_type": "ollama",
            "model": "llama3:8b-instruct-q6_K",
            "client_host": "http://192.168.0.1:11434",
            "seed": 43,
            "cache_seed": None,
        }
    ]
}

# Create the agent and include examples of the function calling JSON in the prompt
# to help guide the model
chatbot = autogen.AssistantAgent(
    name="chatbot",
    system_message="For currency exchange tasks, "
    "only use the functions you have been provided with.",
    llm_config=altmodel_llm_config,
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    is_termination_msg=lambda x: x.get("content", "") and "TERMINATE" in x.get("content", ""),
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1,
)

CurrencySymbol = Literal["USD", "EUR"]

# Define our function that we expect to call
def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:
    if base_currency == quote_currency:
        return 1.0
    elif base_currency == "USD" and quote_currency == "EUR":
        return 1 / 1.1
    elif base_currency == "EUR" and quote_currency == "USD":
        return 1.1
    else:
        raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}")

# Register the function with the agent
@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Currency exchange calculator.")
def currency_calculator(
    base_amount: Annotated[float, "Amount of currency in base_currency"],
    base_currency: Annotated[CurrencySymbol, "Base currency"] = "USD",
    quote_currency: Annotated[CurrencySymbol, "Quote currency"] = "EUR",
) -> str:
    quote_amount = exchange_rate(base_currency, quote_currency) * base_amount
    return f"{format(quote_amount, '.2f')} {quote_currency}"

# start the conversation
res = user_proxy.initiate_chat(
    chatbot,
    message="How much is 123.45 EUR in USD?",
    summary_method="reflection_with_llm",
)
print(f"SUMMARY: {res.summary['content']}")
and result:
user_proxy (to chatbot):
How much is 123.45 EUR in USD?
--------------------------------------------------------------------------------
chatbot (to user_proxy):
***** Suggested tool call (ollama_func_3384): currency_calculator *****
Arguments:
{"base_amount": 123.45, "base_currency": "EUR", "quote_currency": "USD"}
***********************************************************************
--------------------------------------------------------------------------------
>>>>>>>> EXECUTING FUNCTION currency_calculator...
user_proxy (to chatbot):
user_proxy (to chatbot):
***** Response from calling tool (ollama_func_3384) *****
135.80 USD
*********************************************************
--------------------------------------------------------------------------------
chatbot (to user_proxy):
The result is 135.80 USD.
--------------------------------------------------------------------------------
SUMMARY: 123.45 EUR is equivalent to 135.80 USD.
Parallel tool calling (LLM recommends multiple tool calls at a time):
import os
import autogen
import json
from typing import Literal
from typing_extensions import Annotated

# THIS TESTS: PARALLEL TOOL CALLING

altmodel_llm_config = {
    "config_list": [
        {
            "api_type": "ollama",
            "model": "llama3:8b-instruct-q6_K",
            "client_host": "http://192.168.0.1:11434",
            "seed": 43,
            "cache_seed": None,
            "hide_tools": "if_all_run",
        }
    ]
}

# Create the agent and include examples of the function calling JSON in the prompt
# to help guide the model
chatbot = autogen.AssistantAgent(
    name="chatbot",
    system_message="For currency exchange and weather forecasting tasks, "
    "only use the functions you have been provided with.",
    llm_config=altmodel_llm_config,
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    is_termination_msg=lambda x: x.get("content", "") and "TERMINATE" in x.get("content", ""),
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1,
)

# Currency Exchange function
CurrencySymbol = Literal["USD", "EUR"]

# Define our function that we expect to call
def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:
    if base_currency == quote_currency:
        return 1.0
    elif base_currency == "USD" and quote_currency == "EUR":
        return 1 / 1.1
    elif base_currency == "EUR" and quote_currency == "USD":
        return 1.1
    else:
        raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}")

# Register the function with the agent
@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Currency exchange calculator.")
def currency_calculator(
    base_amount: Annotated[float, "Amount of currency in base_currency"],
    base_currency: Annotated[CurrencySymbol, "Base currency"] = "USD",
    quote_currency: Annotated[CurrencySymbol, "Quote currency"] = "EUR",
) -> str:
    quote_amount = exchange_rate(base_currency, quote_currency) * base_amount
    return f"{format(quote_amount, '.2f')} {quote_currency}"

# Weather function
# Example function to make available to model
def get_current_weather(location, unit="fahrenheit"):
    """Get the weather for some location"""
    if "chicago" in location.lower():
        return json.dumps({"location": "Chicago", "temperature": "13", "unit": unit})
    elif "san francisco" in location.lower():
        return json.dumps({"location": "San Francisco", "temperature": "55", "unit": unit})
    elif "new york" in location.lower():
        return json.dumps({"location": "New York", "temperature": "11", "unit": unit})
    else:
        return json.dumps({"location": location, "temperature": "unknown"})

# Register the function with the agent
@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Weather forecast for US cities.")
def weather_forecast(
    location: Annotated[str, "City name"],
) -> str:
    weather_details = get_current_weather(location=location)
    weather = json.loads(weather_details)
    return f"{weather['location']} will be {weather['temperature']} degrees {weather['unit']}"

# start the conversation
res = user_proxy.initiate_chat(
    chatbot,
    message="What's the weather in New York and can you tell me how much is 123.45 EUR in USD so I can spend it on my holiday?",
    summary_method="reflection_with_llm",
)
print(f"SUMMARY: {res.summary['content']}")
and result:
user_proxy (to chatbot):
What's the weather in New York and can you tell me how much is 123.45 EUR in USD so I can spend it on my holiday?
--------------------------------------------------------------------------------
chatbot (to user_proxy):
***** Suggested tool call (ollama_func_5948): weather_forecast *****
Arguments:
{"location": "New York"}
********************************************************************
***** Suggested tool call (ollama_func_5949): currency_calculator *****
Arguments:
{"base_amount": 123.45, "base_currency": "EUR", "quote_currency": "USD"}
***********************************************************************
--------------------------------------------------------------------------------
>>>>>>>> EXECUTING FUNCTION weather_forecast...
>>>>>>>> EXECUTING FUNCTION currency_calculator...
user_proxy (to chatbot):
user_proxy (to chatbot):
***** Response from calling tool (ollama_func_5948) *****
New York will be 11 degrees fahrenheit
*********************************************************
--------------------------------------------------------------------------------
user_proxy (to chatbot):
***** Response from calling tool (ollama_func_5949) *****
135.80 USD
*********************************************************
--------------------------------------------------------------------------------
chatbot (to user_proxy):
It will be 11 degrees Fahrenheit in New York and $135.80 is the equivalent of €123.45 in USD, making it a suitable amount to spend on your holiday.
--------------------------------------------------------------------------------
SUMMARY: New York will be 11 degrees Fahrenheit. €123.45 is equivalent to $135.80 in USD.
Thanks for the contribution @marklysze, looks great!!
Codecov Report
Attention: Patch coverage is 4.94297% with 250 lines in your changes missing coverage. Please review.
Project coverage is 29.68%. Comparing base (38cce47) to head (00d2a58). Report is 7 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #3056 +/- ##
==========================================
- Coverage 30.49% 29.68% -0.82%
==========================================
Files 113 115 +2
Lines 12284 12725 +441
Branches 2602 2709 +107
==========================================
+ Hits 3746 3777 +31
- Misses 8210 8599 +389
- Partials 328 349 +21
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 29.68% <4.94%> (-0.80%) :arrow_down: | |
As discussed in Discord with @marklysze, there are some benefits to separating the tool calling from the client itself: if we have multiple clients that lack tool calling in their API, the same code would work for all of them, and to avoid duplicating code and having to update it in multiple places, it makes more sense to use a Capability. Similarly for message cleanup (i.e. adjusting the messages given to an agent): the variety of tweaks needed depends greatly on both the client and the model, and is often related to the model's prompt template (which is beyond the scope of discussion here, except that it's fairly invisible to most devs until it errors), so putting this code into a Capability also makes more sense.
I went down this road and realized the above was a better answer, and kudos to Mark for his work overall. Let's adjust this so that nobody writing a client in the future needs to rewrite the above pieces, and devs can add a few lines of code when they get an error or need tool calling due to a model or client.
I'll try and get my code organized (planned on it last week, but was dealing with some illness) and get things posted as a PR, and we can compare and see what makes the most sense to move forward with....
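To make the proposal a little more concrete, here is a rough sketch of the shape such a capability could take, assuming the existing AgentCapability base class and the process_all_messages_before_reply hook that other message-transforming capabilities plug into; the class name and body are hypothetical, not the planned implementation:

from autogen import ConversableAgent
from autogen.agentchat.contrib.capabilities.agent_capability import AgentCapability


class ManualToolCalling(AgentCapability):
    """Hypothetical capability: inject tool descriptions into the outgoing messages
    for clients whose APIs have no native tool calling."""

    def add_to_agent(self, agent: ConversableAgent):
        # Rewrite the message list just before the agent generates a reply.
        agent.register_hook("process_all_messages_before_reply", self._inject_tools)

    def _inject_tools(self, messages):
        # Placeholder: prepend a system message describing the registered tools and
        # translate earlier tool_call/tool-result messages into plain text here.
        return messages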
Ollama client in c# example needed
@GeorgeS2019, is that something you would be able to assist with?
May I suggest improving this Ollama object with the Testcontainers Ollama object? It's quite nice when everything is containerized, that's why. For coding, you can do everything in code, which is also a nice thing.
If this doesn't fit, I'll take notes on how you do it, and maybe try my hand at providing this as an alternative.
Hi @Josephrp, sounds like it's worth looking in to. Is there a reference to what you're thinking? I'm not sure what that involves.
https://testcontainers.com/modules/ollama/
This is the object.
This is the pull request for Python: https://github.com/testcontainers/testcontainers-python/pull/618/files
But in general I really like this "ephemeral" way to make containers for this kind of purpose.
Hey there, hope I didn't take the wind out of your sails; I'm keen to look into this together if you want :-)
Hey @Josephrp, sorry I'm currently out with the flu :(
I think it's a good topic to investigate. My focus is on having local LLMs available within AutoGen, so don't worry about taking the wind out of my sails :).
Perhaps a proof of concept that others can provide feedback on. Not sure if that's tricky but may be good for others to weigh in and also good to know if it's ollama specific or could be used for other client classes.
Sorry I haven't provided feedback on it.
hey get better soon okay :-)
When you get ready to publish samples, I recommend using "http://localhost:11434" or "http://127.0.0.1:11434" instead of "http://192.168.0.1:11434".
Hi @marklysze! I tried your second example (tool calling with currency exchange). Here is my output. It seems that it can call the functions, but there is no "chatbot (to user_proxy): The result is 135.80 USD." And there is a warning message. May I kindly ask how to fix it? Thanks a lot.
Output
user_proxy (to chatbot):
How much is 123.45 EUR in USD?
[autogen.oai.client: 07-20 15:50:54] {329} WARNING - Model ollama/llama3 is not found. The cost will be 0. In your config_list, add field {"price" : [prompt_price_per_1k, completion_token_price_per_1k]} for customized pricing.
chatbot (to user_proxy):
***** Suggested tool call (call_2e7ece9b-6fb2-4871-b503-5f91bfefd58e): currency_calculator *****
Arguments:
{"base_amount": 123.45, "base_currency": "EUR", "quote_currency": "USD"}
************************************************************************************************
Provide feedback to chatbot. Press enter to skip and use auto-reply, or type 'exit' to end the conversation:
>>>>>>>> NO HUMAN INPUT RECEIVED.
>>>>>>>> USING AUTO REPLY...
>>>>>>>> EXECUTING FUNCTION currency_calculator...
user_proxy (to chatbot):
user_proxy (to chatbot):
***** Response from calling tool (call_2e7ece9b-6fb2-4871-b503-5f91bfefd58e) *****
135.80 USD
**********************************************************************************
[autogen.oai.client: 07-20 15:51:01] {329} WARNING - Model ollama/llama3 is not found. The cost will be 0. In your config_list, add field {"price" : [prompt_price_per_1k, completion_token_price_per_1k]} for customized pricing.
chatbot (to user_proxy):
***** Suggested tool call (call_0c00a35f-9be7-41da-a57f-b5599049ab3b): currency_calculator *****
Arguments:
{"base_amount": 135.8, "base_currency": "USD", "quote_currency": "EUR"}
************************************************************************************************
--------------------------------------------------------------------------------
Provide feedback to chatbot. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: print(f"SUMMARY: {res.summary['content']}")
user_proxy (to chatbot):
Here is my code:
import autogen
from typing import Literal
from typing_extensions import Annotated

config_list = [
    {
        "api_type": "ollama",
        "model": "llama3",
        "base_url": "http://127.0.0.1:4000",
        "seed": 43,
        "cache_seed": None,
    }
]

llm_config = {"config_list": config_list}

chatbot = autogen.AssistantAgent(
    name="chatbot",
    system_message="For currency exchange tasks, "
    "only use the functions you have been provided with.",
    llm_config=llm_config,
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    is_termination_msg=lambda x: x.get("content", "") and "TERMINATE" in x.get("content", ""),
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1,
)

CurrencySymbol = Literal["USD", "EUR"]

def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:
    if base_currency == quote_currency:
        return 1.0
    elif base_currency == "USD" and quote_currency == "EUR":
        return 1 / 1.1
    elif base_currency == "EUR" and quote_currency == "USD":
        return 1.1
    else:
        raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}")

@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Currency exchange calculator.")
def currency_calculator(
    base_amount: Annotated[float, "Amount of currency in base_currency"],
    base_currency: Annotated[CurrencySymbol, "Base currency"] = "USD",
    quote_currency: Annotated[CurrencySymbol, "Quote currency"] = "EUR",
) -> str:
    quote_amount = exchange_rate(base_currency, quote_currency) * float(base_amount)
    return f"{format(quote_amount, '.2f')} {quote_currency}"

res = user_proxy.initiate_chat(
    chatbot,
    message="How much is 123.45 EUR in USD?",
    summary_method="reflection_with_llm",
)

print(f"SUMMARY: {res.summary['content']}")
⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.
Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.
🔎 Detected hardcoded secret in your pull request
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| 10404695 | Triggered | Generic High Entropy Secret | 49f4c192262a38fda23f8e94362e26e4175434af | test/oai/test_utils.py | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely. Learn here the best practices.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
Trying to run the autogen deeplearning.ai online course with ollama through this branch.
Ran into some issues with lesson 2 Sequential Chats and Customer Onboarding which I believe indicates an issue with the ollama client in the current form.
Stacktrace on error:
Traceback (most recent call last):
  File "/workspaces/multi-agent-autogen-experiments/lesson2_local.py", line 106, in <module>
    chat_results = initiate_chats(chats)
                   ^^^^^^^^^^^^^^^^^^^^^
  File "/home/autogen/autogen/autogen/agentchat/chat.py", line 199, in initiate_chats
    __post_carryover_processing(chat_info)
  File "/home/autogen/autogen/autogen/agentchat/chat.py", line 119, in __post_carryover_processing
    ("\n").join([t for t in chat_info["carryover"]])
TypeError: sequence item 0: expected str instance, dict found
https://github.com/microsoft/autogen/blob/ollamaclient/autogen/agentchat/chat.py#L199 is where the issue occurs
Unclear to me exactly what is causing this, but the same code works if the llm_config points to an OpenAI model.
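For what it's worth, the line quoted in the traceback simply joins the carryover items as strings, so the failure can be reproduced in isolation with a made-up, dict-shaped carryover entry (the value below is hypothetical, just to illustrate the type error):

# Hypothetical illustration of the TypeError above: str.join() needs string items,
# so a carryover entry that is a dict (e.g. a summary returned as a message dict) fails.
chat_info = {"carryover": [{"content": "{'name': 'Mark', 'location': 'New York'}"}]}
("\n").join([t for t in chat_info["carryover"]])
# TypeError: sequence item 0: expected str instance, dict found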
Code for reproducing
import pprint
import os

# Load .env with OPENAI_API_KEY
from dotenv import load_dotenv
load_dotenv()

llm_config = {
    "config_list": [
        {
            "api_type": "ollama",
            "model": "llama3:8b",
            "client_host": "http://host.docker.internal:11434",
            "seed": 42,
            "price": [0.0, 0.0],
        }
    ]
}

# The example below works with openai llm_config
# llm_config = {"model": "gpt-4o-mini", "api_key": os.environ["OPENAI_API_KEY"]}

from autogen import ConversableAgent

onboarding_personal_information_agent = ConversableAgent(
    name="Onboarding Personal Information Agent",
    system_message='''You are a helpful customer onboarding agent,
    you are here to help new customers get started with our product.
    Your job is to gather customer's name and location.
    Do not ask for other information. Return 'TERMINATE'
    when you have gathered all the information.''',
    llm_config=llm_config,
    code_execution_config=False,
    human_input_mode="NEVER",
)

onboarding_topic_preference_agent = ConversableAgent(
    name="Onboarding Topic preference Agent",
    system_message='''You are a helpful customer onboarding agent,
    you are here to help new customers get started with our product.
    Your job is to gather customer's preferences on news topics.
    Do not ask for other information.
    Return 'TERMINATE' when you have gathered all the information.''',
    llm_config=llm_config,
    code_execution_config=False,
    human_input_mode="NEVER",
)

customer_engagement_agent = ConversableAgent(
    name="Customer Engagement Agent",
    system_message='''You are a helpful customer service agent
    here to provide fun for the customer based on the user's
    personal information and topic preferences.
    This could include fun facts, jokes, or interesting stories.
    Make sure to make it engaging and fun!
    Return 'TERMINATE' when you are done.''',
    llm_config=llm_config,
    code_execution_config=False,
    human_input_mode="NEVER",
    is_termination_msg=lambda msg: "terminate" in msg.get("content").lower(),
)

customer_proxy_agent = ConversableAgent(
    name="customer_proxy_agent",
    llm_config=False,
    code_execution_config=False,
    human_input_mode="ALWAYS",
    is_termination_msg=lambda msg: "terminate" in msg.get("content").lower(),
)

chats = [
    {
        "sender": onboarding_personal_information_agent,
        "recipient": customer_proxy_agent,
        "message":
            "Hello, I'm here to help you get started with our product."
            "Could you tell me your name and location?",
        "summary_method": "reflection_with_llm",
        "summary_args": {
            "summary_prompt": "Return the customer information "
            "into as JSON object only: "
            "{'name': '', 'location': ''}",
        },
        "max_turns": 2,
        "clear_history": True,
    },
    {
        "sender": onboarding_topic_preference_agent,
        "recipient": customer_proxy_agent,
        "message":
            "Great! Could you tell me what topics you are "
            "interested in reading about?",
        "summary_method": "reflection_with_llm",
        "max_turns": 1,
        "clear_history": False,
    },
    {
        "sender": customer_proxy_agent,
        "recipient": customer_engagement_agent,
        "message": "Let's find something fun to read.",
        "max_turns": 1,
        "summary_method": "reflection_with_llm",
    },
]

from autogen import initiate_chats

chat_results = initiate_chats(chats)

for chat_result in chat_results:
    print(chat_result.summary)
    print("\n")
Hi @marklysze! I tried your second example (tool calling with currency exchange). Here is my output. It seems that it can call the functions, but there is no "chatbot (to user_proxy): The result is 135.80 USD." And there is a warning message. May I kindly ask how to fix it? Thanks a lot.
Hi @yilinwu123, thanks for testing this out...
I ran your code but with my local config:
config_list = [
    {
        "api_type": "ollama",
        "model": "llama3:instruct",
        "client_host": "http://192.168.0.115:11434",
        "seed": 43,
        "cache_seed": None,
    }
]
Please note that the ollama client class uses client_host (not 'base_url'). I need to include the :instruct on my model name, but the default is instruct so you may not need to.
When I ran it I did get the summary showing. Can you try with client_host?
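In other words, the config should be shaped like this (this is just a sketch; adjust the model name and URL to your local Ollama setup):

config_list = [
    {
        "api_type": "ollama",
        "model": "llama3",
        "client_host": "http://localhost:11434",  # the Ollama server itself, not a LiteLLM proxy
        "seed": 43,
        "cache_seed": None,
    }
]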
Trying to run the autogen deeplearning.ai online course with ollama through this branch.
Ran into some issues with lesson 2 Sequential Chats and Customer Onboarding which I believe indicates an issue with the ollama client in the current form.
Hi @elsewhat, good to get this code tested... thanks for trying it out against the course material.
I just tried with my config:
llm_config = {
    "config_list": [
        {
            "api_type": "ollama",
            "model": "llama3:instruct",
            "client_host": "http://192.168.0.115:11434",
            "seed": 42,
            "price": [0.0, 0.0],
        }
    ]
}
And I'm getting through the code:
********************************************************************************
Starting a new chat....
********************************************************************************
Onboarding Personal Information Agent (to customer_proxy_agent):
Hello, I'm here to help you get started with our product.Could you tell me your name and location?
--------------------------------------------------------------------------------
Provide feedback to Onboarding Personal Information Agent. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: Mark and I'm in New York
customer_proxy_agent (to Onboarding Personal Information Agent):
Mark and I'm in New York
--------------------------------------------------------------------------------
Onboarding Personal Information Agent (to customer_proxy_agent):
So, just to confirm: your name is Mark and you're located in New York?
Please let me know if that's correct before I proceed.
(Note: waiting for confirmation)
--------------------------------------------------------------------------------
Provide feedback to Onboarding Personal Information Agent. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: Yes
customer_proxy_agent (to Onboarding Personal Information Agent):
Yes
--------------------------------------------------------------------------------
********************************************************************************
Starting a new chat....
********************************************************************************
Onboarding Topic preference Agent (to customer_proxy_agent):
Great! Could you tell me what topics you are interested in reading about?
Context:
{'name': 'Mark', 'location': 'New York'}
--------------------------------------------------------------------------------
Provide feedback to Onboarding Topic preference Agent. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: Architecture
customer_proxy_agent (to Onboarding Topic preference Agent):
Architecture
--------------------------------------------------------------------------------
********************************************************************************
Starting a new chat....
********************************************************************************
customer_proxy_agent (to Customer Engagement Agent):
Let's find something fun to read.
Context:
{'name': 'Mark', 'location': 'New York'}
I'm interested in reading about architecture.
--------------------------------------------------------------------------------
Customer Engagement Agent (to customer_proxy_agent):
Hi Mark! It's great to chat with you about architecture!
As a New Yorker, I'm sure you appreciate the iconic buildings and structures that make up the city's skyline. Did you know that the Empire State Building was the tallest building in the world when it was completed in 1931? It held that title for over 40 years!
But let's talk about some fun facts about architecture. Did you know that:
* The Guggenheim Museum in New York City has a unique spiral design, which allows visitors to see art from multiple angles?
* The Flatiron Building in Manhattan is one of the most iconic and recognizable buildings in the world?
* The Brooklyn Bridge, also in NYC, was the longest suspension bridge in the world when it opened in 1883?
Now, let me share a cool story about architecture. Have you heard about the High Line in Chelsea, Manhattan? It's an elevated park built on an old rail line. When the trains stopped running, the city decided to transform the area into a public green space. Today, it's one of the most popular parks in New York City!
I hope this sparks your interest in architecture, Mark! What do you think about these fascinating facts and stories? Would you like to learn more?
(When I'm done, I'll say "TERMINATE"!)
--------------------------------------------------------------------------------
{'content': "{'name': 'Mark', 'location': 'New York'}", 'role': 'assistant', 'function_call': None, 'tool_calls': None}
{'content': "I'm interested in reading about architecture.", 'role': 'assistant', 'function_call': None, 'tool_calls': None}
{'content': "New York-based architecture facts include the Empire State Building's former tallest-in-the-world status, the Guggenheim Museum's unique spiral design, and iconic buildings like the Flatiron Building and Brooklyn Bridge. The High Line in Chelsea is also mentioned as a popular elevated park built on an old rail line.", 'role': 'assistant', 'function_call': None, 'tool_calls': None}
How far are you getting through before the crash?
Ollama has just released an updated library, version 0.3.0, that includes support for tool calling! I'll test this out and see whether it's better than what's in place so far.
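For reference, a rough sketch of what native tool calling looks like with the 0.3.0 Python library (tool schema abbreviated here; see the Ollama docs for the full format, and treat the model name as a placeholder):

import ollama

# Hedged sketch of Ollama 0.3.0's native tool calling (abbreviated tool schema).
response = ollama.chat(
    model="llama3.1",
    messages=[{"role": "user", "content": "How much is 123.45 EUR in USD?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "currency_calculator",
                "description": "Currency exchange calculator.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "base_amount": {"type": "number"},
                        "base_currency": {"type": "string"},
                        "quote_currency": {"type": "string"},
                    },
                    "required": ["base_amount"],
                },
            },
        }
    ],
)
# If the model chose to call a tool, the calls appear on the returned message.
print(response["message"].get("tool_calls"))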
Hi @marklysze! I tried your second example (tool calling with currency exchange). Here is my output. It seems that it can call the functions, but there is no "chatbot (to user_proxy): The result is 135.80 USD." And there is a warning message. May I kindly ask how to fix it? Thanks a lot.
Hi @marklysze! Thanks a lot for your reply. I changed llm_config (two versions with two URLs) as below, but there is a new error about the API key. May I kindly check: do I need to run litellm --model ollama/llama3 when I set "client_host"? The URL for LiteLLM is "http://127.0.0.1:4000", and I checked Ollama, whose localhost URL is "http://localhost:11434". Thanks a lot for your support.
config_list = [
    {
        "api_type": "ollama",
        "model": "llama3",
        "client_host": "http://127.0.0.1:4000",
        "seed": 43,
        "cache_seed": None,
    }
]

config_list = [
    {
        "api_type": "ollama",
        "model": "llama3",
        "client_host": "http://localhost:11434",
        "seed": 43,
        "cache_seed": None,
    }
]
The error is
openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
Hey all - I've updated the code to support Ollama's native tool calling. I've done some testing with Llama 3.1 8B (you'll need to ollama pull the model again if you downloaded it before, so it has tool calling incorporated) and it works okay, in that it will use the normal tools messaging format. Unfortunately, it's not perfect: I noticed it will also fall into the continuous tool calling cycle (it keeps recommending a tool call even after the tools have been run).
If you have already been running this code, you'll need to update your Ollama package to 0.3.0 (pip install -U ollama).
I have left the manual tool calling in because it allows you to run tool calling with any Ollama model, and it has run fairly well in my testing so far.
A few parameters to consider now for your llm config:
- native_tool_calls: True (default) or False (uses manual tool calling)
- hide_tools: 'never', 'if_all_run', or 'if_any_run'
hide_tools is useful for hiding tools once they have been run; this helps stop the LLM recommending tools over and over.
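For example, a config exercising both parameters (the URL and model name below are just the placeholders from the earlier examples):

altmodel_llm_config = {
    "config_list": [
        {
            "api_type": "ollama",
            "model": "llama3.1:8b",
            "client_host": "http://192.168.0.1:11434",
            "native_tool_calls": True,   # use Ollama's native tool calling (requires 0.3.0+)
            "hide_tools": "if_any_run",  # stop offering tools once any tool has been run
        }
    ]
}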
I've updated the documentation page as well with this detail, so see the local-ollama.ipynb file in the PR to find out how to use them.
Hi @marklysze ! Thanks a lot for your reply. I change llm_config (two versions with two urls) as below. But there is a new error with API key. May I kindly check do I need to use litellm --model ollama.llama3 when I set "client_host"? The url for litellm is "http://127.0.0.1:4000". And I checked ollama, the local host url is "http://localhost:11434". Thanks a lot for your support.
openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
Hey @yilinwu123, hmmmm, the api_key shouldn't be required. Are you able to show how you're running it by providing a bit more code? And just checking that you're using the code from this branch?
Hi @marklysze! Thanks a lot for your help. I first create a conda environment using terminal and make sure Ollama is running on my computer. Then I use litellm --model ollama/llama3 and get the url "http://127.0.0.1:4000". Then I open visual studio and run the code under this conda environment. The code is below (I also tried to set client_host to be "http://localhost:11434"):
import autogen
from typing import Literal
from typing_extensions import Annotated

config_list = [
    {
        "api_type": "ollama",
        "model": "llama3",
        "client_host": "http://127.0.0.1:4000",
        "seed": 43,
        "cache_seed": None,
    }
]

llm_config = {"config_list": config_list}

chatbot = autogen.AssistantAgent(
    name="chatbot",
    system_message="For currency exchange tasks, "
    "only use the functions you have been provided with.",
    llm_config=llm_config,
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    is_termination_msg=lambda x: x.get("content", "") and "TERMINATE" in x.get("content", ""),
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1,
)

CurrencySymbol = Literal["USD", "EUR"]

def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:
    if base_currency == quote_currency:
        return 1.0
    elif base_currency == "USD" and quote_currency == "EUR":
        return 1 / 1.1
    elif base_currency == "EUR" and quote_currency == "USD":
        return 1.1
    else:
        raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}")

@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Currency exchange calculator.")
def currency_calculator(
    base_amount: Annotated[float, "Amount of currency in base_currency"],
    base_currency: Annotated[CurrencySymbol, "Base currency"] = "USD",
    quote_currency: Annotated[CurrencySymbol, "Quote currency"] = "EUR",
) -> str:
    quote_amount = exchange_rate(base_currency, quote_currency) * float(base_amount)
    return f"{format(quote_amount, '.2f')} {quote_currency}"

res = user_proxy.initiate_chat(
    chatbot,
    message="How much is 123.45 EUR in USD?",
    summary_method="reflection_with_llm",
)

print(f"SUMMARY: {res.summary['content']}")
Hi @marklysze! Thanks a lot for your help. I first create a conda environment using terminal and make sure Ollama is running on my computer. Then I use litellm --model ollama/llama3 and get the url "http://127.0.0.1:4000". Then I open visual studio and run the code under this conda environment. The code is below (I also tried to set client_host to be "http://localhost:11434"):
Ah @yilinwu123, I see... it appears you're running LiteLLM (with Ollama support) rather than Ollama directly.
To get this to work with this branch you need to be just running Ollama without LiteLLM. The client_host should be the Ollama URL (not LiteLLM).
To make sure you have the right URL, don't run LiteLLM and try something like this in your terminal or command prompt:
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?"
}'
... this should run Ollama inference. Change the URL if it's not working.
If you can't find the right URL, make sure you've installed Ollama correctly: https://ollama.com/download
curl http://localhost:11434/api/generate -d '{ "model": "llama3", "prompt": "Why is the sky blue?" }'
Hi @marklysze! I tried to run Ollama without LiteLLM. I copied the URL "http://localhost:11434" into my browser and it shows 'Ollama is running'. I followed the steps and ran the provided command in the terminal (it seems to work, and the response is "model":"llama3","created_at":"2024-07-29T13:16:12.784813Z","response":"","done":true,"done_reason":"stop","context"). But there is still an API error when I run the code. May I kindly check how to get the URL of Ollama on my local computer? Thanks a lot for your help. Really appreciate it.
Hey @yilinwu123, this might help you with understanding the REST API of Ollama; here is the link.
@yilinwu123, if you're able to hop onto the AutoGen discord (https://aka.ms/autogen-dc), feel free to message me (username msze) and I'll try and help you there.
Okay - I think if we can review this and get it out and available for developers to use and provide feedback on, that would be great.