[Issue]: Unable to enable tool calling when using a custom model
Describe the issue
I am trying to combine the following two notebooks into one:
- Agent Chat with custom model loading
- Auto Generated Agent Chat: Task Solving with Langchain Provided Tools as Functions
In my simple and naive script to test the concept, I created two agent instances: assistant as AssistantAgent() and user_proxy as UserProxyAgent(). The challenge is how to initialize and register assistant when I need to inform both assistant and user_proxy of the provided tools as functions. However, an error is raised when running either assistant.register_model_client() or user_proxy.initiate_chat(). I don't know whether the problem is in my script or whether there is a bug. I would be grateful for any help with this.
Steps to reproduce
Step 1: Confirm the custom model's local path
Confirm that the model Mistral-7B-OpenOrca exists locally at the path /dev/Open-Orca/Mistral-7B-OpenOrca
Step 2: Create a JSON file named OAI_CONFIG_LIST.json under the directory /dev/ with the following content
[
{
"model": "gpt-4",
"api_key": "<your OpenAI API key here>"
},
{
"model": "/dev/Open-Orca/Mistral-7B-OpenOrca",
"model_client_cls": "CustomModelClient",
"params": {
"max_length": 1000
}
}
]
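A quick way to sanity-check this step (a sketch, assuming the file above is saved as /dev/OAI_CONFIG_LIST.json) is to load and print the filtered list before creating any agents; the "gpt-4" entry should be filtered out and exactly one entry should remain:
import autogen

# Sketch: load the config and keep only entries that declare
# "model_client_cls": "CustomModelClient"; exactly one entry should remain.
config_list_custom = autogen.config_list_from_json(
    env_or_file="OAI_CONFIG_LIST.json",
    file_location="/dev/",
    filter_dict={"model_client_cls": ["CustomModelClient"]},
)
print(len(config_list_custom))         # expected: 1
print(config_list_custom[0]["model"])  # expected: /dev/Open-Orca/Mistral-7B-OpenOrca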
Step 3: Run the following Python script
import math
from types import SimpleNamespace
from typing import Type
import autogen
from autogen import AssistantAgent, UserProxyAgent
from transformers import AutoTokenizer, GenerationConfig, AutoModelForCausalLM
from langchain.pydantic_v1 import BaseModel, Field
from langchain.tools import BaseTool
class CustomModelClient:
def __init__(self, config, **kwargs):
self.device = config.get("device", "cpu")
self.model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path = config["model"])
self.model_name = config["model"]
self.tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path = config["model"], use_fast = False)
self.tokenizer.pad_token_id = self.tokenizer.eos_token_id
# params are set by the user and consumed by the user since they are providing a custom model
# so anything can be done here
gen_config_params = config.get("params", {})
self.max_length = gen_config_params.get("max_length", 256)
def create(self, params):
if params.get("stream", False) and "messages" in params:
raise NotImplementedError("Local models do not support streaming.")
else:
num_of_responses = params.get("n", 1)
# can create my own data response class
# here using SimpleNamespace for simplicity
# as long as it adheres to the ClientResponseProtocol
response = SimpleNamespace()
inputs = self.tokenizer.apply_chat_template(
conversation = params["messages"],
return_tensors="pt",
add_generation_prompt=True
).to(self.device)
inputs_length = inputs.shape[-1]
# add inputs_length to max_length
max_length = self.max_length + inputs_length
generation_config = GenerationConfig(
max_length = max_length,
eos_token_id = self.tokenizer.eos_token_id,
pad_token_id = self.tokenizer.pad_token_id,
)
response.choices = []
response.model = self.model_name
for _ in range(num_of_responses):
outputs = self.model.generate(
inputs = inputs,
generation_config=generation_config
)
# Decode only the newly generated text, excluding the prompt
text = self.tokenizer.decode(token_ids = outputs[0, inputs_length:])
choice = SimpleNamespace()
choice.message = SimpleNamespace()
choice.message.content = text
choice.message.function_call = None
response.choices.append(choice)
return response
def message_retrieval(self, response):
"""Retrieve the messages from the response."""
choices = response.choices
return [choice.message.content for choice in choices]
def cost(self, response) -> float:
"""Calculate the cost of the response."""
response.cost = 0
return 0
@staticmethod
def get_usage(response):
# returns a dict of prompt_tokens, completion_tokens, total_tokens, cost, model
# if usage needs to be tracked, else None
return {}
class CustomToolInput(BaseModel):
income: float = Field()
class CustomTool(BaseTool):
name = "tax_calculator"
description = "Use this tool when you need to calculate the tax using the income"
args_schema: Type[BaseModel] = CustomToolInput
    def _run(self, income: float):
        return float(income) * math.pi / 100
# Define a function to generate llm_config from a LangChain tool
def generate_llm_config(tool):
# Define the function schema based on the tool's args_schema
function_schema = {
"name": tool.name.lower().replace(" ", "_"),
"description": tool.description,
"parameters": {
"type": "object",
"properties": {},
"required": [],
},
}
if tool.args is not None:
function_schema["parameters"]["properties"] = tool.args
return function_schema
custom_tool = CustomTool()
config_list_custom = autogen.config_list_from_json(
env_or_file = "OAI_CONFIG_LIST.json",
file_location = "/dev/",
filter_dict = {"model_client_cls": ["CustomModelClient"]},
)
user_proxy = UserProxyAgent(
name = "user_proxy",
is_termination_msg = lambda x: x.get("content", "") and x.get("content", "").rstrip().endswith("TERMINATE"),
human_input_mode = "NEVER",
max_consecutive_auto_reply = 2,
code_execution_config = {
"work_dir": "coding",
"use_docker": False, # Please set use_docker=True if docker is available to run the generated code. Using docker is safer than running the generated code directly.
"timeout": 600,
"last_n_messages": 1
},
)
user_proxy.register_function(
function_map={
custom_tool.name: custom_tool._run
}
)
llm_config = {
"functions": [generate_llm_config(custom_tool)],
"config_list": config_list_custom,
"timeout": 120
}
assistant = AssistantAgent(
name = "assistant",
llm_config = llm_config,
system_message = "For coding tasks, only use the functions you have been provided with. Reply TERMINATE when the task is done."
)
assistant.register_model_client(model_client_cls = CustomModelClient)
with autogen.Cache.disk():
user_proxy.initiate_chat(assistant, message="when the income is 100, calculate the tax")
Screenshots and logs
I get the following error after running assistant.register_model_client(model_client_cls = CustomModelClient):
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/tmp/ipykernel_1734/1825630986.py in <cell line: 1>()
----> 1 assistant.register_model_client(model_client_cls = CustomModelClient)
/opt/software/Miniconda/lib/python3.8/site-packages/autogen/agentchat/conversable_agent.py in register_model_client(self, model_client_cls, **kwargs)
2296 **kwargs: The kwargs for the custom client class to be initialized with
2297 """
-> 2298 self.client.register_model_client(model_client_cls, **kwargs)
2299
2300 def register_hook(self, hookable_method: Callable, hook: Callable):
/opt/software/Miniconda/lib/python3.8/site-packages/autogen/oai/client.py in register_model_client(self, model_client_cls, **kwargs)
431 )
432 else:
--> 433 raise ValueError(
434 f'Model client "{model_client_cls.__name__}" is being registered but was not found in the config_list. '
435 f'Please make sure to include an entry in the config_list with "model_client_cls": "{model_client_cls.__name__}"'
ValueError: Model client "CustomModelClient" is being registered but was not found in the config_list. Please make sure to include an entry in the config_list with "model_client_cls": "CustomModelClient"
and I get the following error after running user_proxy.initiate_chat(assistant, message="when the income is 100, calculate the tax"):
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
/tmp/ipykernel_1734/532364807.py in <cell line: 1>()
1 with autogen.Cache.disk():
----> 2 user_proxy.initiate_chat(assistant, message="when the income is 100, calculate the tax")
/opt/software/Miniconda/lib/python3.8/site-packages/autogen/agentchat/conversable_agent.py in initiate_chat(self, recipient, clear_history, silent, cache, **context)
791 agent.client_cache = cache
792 self._prepare_chat(recipient, clear_history)
--> 793 self.send(self.generate_init_message(**context), recipient, silent=silent)
794 summary = self._summarize_chat(
795 context.get("summary_method"),
/opt/software/Miniconda/lib/python3.8/site-packages/autogen/agentchat/conversable_agent.py in send(self, message, recipient, request_reply, silent)
502 valid = self._append_oai_message(message, "assistant", recipient)
503 if valid:
--> 504 recipient.receive(message, self, request_reply, silent)
505 else:
506 raise ValueError(
/opt/software/Miniconda/lib/python3.8/site-packages/autogen/agentchat/conversable_agent.py in receive(self, message, sender, request_reply, silent)
677 if request_reply is False or request_reply is None and self.reply_at_receive[sender] is False:
678 return
--> 679 reply = self.generate_reply(messages=self.chat_messages[sender], sender=sender)
680 if reply is not None:
681 self.send(reply, sender, silent=silent)
/opt/software/Miniconda/lib/python3.8/site-packages/autogen/agentchat/conversable_agent.py in generate_reply(self, messages, sender, **kwargs)
1635 continue
1636 if self._match_trigger(reply_func_tuple["trigger"], sender):
-> 1637 final, reply = reply_func(self, messages=messages, sender=sender, config=reply_func_tuple["config"])
1638 if final:
1639 return reply
/opt/software/Miniconda/lib/python3.8/site-packages/autogen/agentchat/conversable_agent.py in generate_oai_reply(self, messages, sender, config)
1053 if messages is None:
1054 messages = self._oai_messages[sender]
-> 1055 extracted_response = self._generate_oai_reply_from_client(
1056 client, self._oai_system_message + messages, self.client_cache
1057 )
/opt/software/Miniconda/lib/python3.8/site-packages/autogen/agentchat/conversable_agent.py in _generate_oai_reply_from_client(self, llm_client, messages, cache)
1072
1073 # TODO: #1143 handle token limit exceeded error
-> 1074 response = llm_client.create(
1075 context=messages[-1].pop("context", None),
1076 messages=all_messages,
/opt/software/Miniconda/lib/python3.8/site-packages/autogen/oai/client.py in create(self, **config)
526 ]
527 if non_activated:
--> 528 raise RuntimeError(
529 f"Model client(s) {non_activated} are not activated. Please register the custom model clients using `register_model_client` or filter them out form the config list."
530 )
RuntimeError: Model client(s) ['CustomModelClient'] are not activated. Please register the custom model clients using `register_model_client` or filter them out form the config list.
Additional Information
- AutoGen Version: 0.2.13
- Operation System: Linux
- Python Version: 3.8.11
If I do what is shown below instead, the script runs to completion. However, based on the output, the custom model is clearly NOT aware of the provided tool.
assistant = AssistantAgent(
name = "assistant",
llm_config = {"config_list": config_list_custom},
system_message = "For coding tasks, only use the functions you have been provided with. Reply TERMINATE when the task is done."
)
assistant.register_model_client(model_client_cls = CustomModelClient)
with autogen.Cache.disk():
user_proxy.initiate_chat(assistant, message="when the income is 100, calculate the tax")
Can you confirm whether Mistral-7B-OpenOrca supports a function call format in its prompt template? To enable function calling, the model itself needs to support, or be fine-tuned with, a function call format as well.
I cannot repro; the above code works for me. I would suggest checking that the correct OAI_CONFIG_LIST file is being picked up, because the message ValueError: Model client "CustomModelClient" is being registered but was not found in the config_list. Please make sure to include an entry in the config_list with "model_client_cls": "CustomModelClient" implies that the config_list actually loaded does not contain the CustomModelClient entry.
Can you confirm whether Mistral-7B-OpenOrca supports a function call format in its prompt template? To enable function calling, the model itself needs to support, or be fine-tuned with, a function call format as well.
Yeah. That's really a good call-out! I missed that.
I finally got myself unblocked by using the class CustomModelClientWithArguments. However, tool calling was still not successful, and my best guess is that Mistral-7B-OpenOrca was not fine-tuned for tool calling.
Here is my solution:
import math
import logging
from types import SimpleNamespace
from typing import Type
import autogen
from autogen import AssistantAgent, UserProxyAgent
from transformers import AutoTokenizer, GenerationConfig, AutoModelForCausalLM
from langchain.pydantic_v1 import BaseModel, Field
from langchain.tools import BaseTool

logger = logging.getLogger(__name__)
class CustomModelClient:
def __init__(self, config, **kwargs):
self.device = config.get("device", "cpu")
self.model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path = config["model"])
self.model_name = config["model"]
self.tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path = config["model"], use_fast = False)
self.tokenizer.pad_token_id = self.tokenizer.eos_token_id
# params are set by the user and consumed by the user since they are providing a custom model
# so anything can be done here
gen_config_params = config.get("params", {})
self.max_length = gen_config_params.get("max_length", 256)
def create(self, params):
if params.get("stream", False) and "messages" in params:
raise NotImplementedError("Local models do not support streaming.")
else:
num_of_responses = params.get("n", 1)
# can create my own data response class
# here using SimpleNamespace for simplicity
# as long as it adheres to the ClientResponseProtocol
response = SimpleNamespace()
inputs = self.tokenizer.apply_chat_template(
conversation = params["messages"],
return_tensors="pt",
add_generation_prompt=True
).to(self.device)
inputs_length = inputs.shape[-1]
# add inputs_length to max_length
max_length = self.max_length + inputs_length
generation_config = GenerationConfig(
max_length = max_length,
eos_token_id = self.tokenizer.eos_token_id,
pad_token_id = self.tokenizer.pad_token_id,
)
response.choices = []
response.model = self.model_name
for _ in range(num_of_responses):
outputs = self.model.generate(
inputs = inputs,
generation_config=generation_config
)
# Decode only the newly generated text, excluding the prompt
text = self.tokenizer.decode(token_ids = outputs[0, inputs_length:])
choice = SimpleNamespace()
choice.message = SimpleNamespace()
choice.message.content = text
choice.message.function_call = None
response.choices.append(choice)
return response
def message_retrieval(self, response):
"""Retrieve the messages from the response."""
choices = response.choices
return [choice.message.content for choice in choices]
def cost(self, response) -> float:
"""Calculate the cost of the response."""
response.cost = 0
return 0
@staticmethod
def get_usage(response):
# returns a dict of prompt_tokens, completion_tokens, total_tokens, cost, model
# if usage needs to be tracked, else None
return {}
class CustomModelClientWithArguments(CustomModelClient):
def __init__(self, config, loaded_model, tokenizer, **kwargs):
logger.info(f"CustomModelClientWithArguments config: {config}")
self.device = config.get("device", "cpu")
self.model = loaded_model
self.model_name = config["model"]
self.tokenizer = tokenizer
self.tokenizer.pad_token_id = tokenizer.eos_token_id
gen_config_params = config.get("params", {})
self.max_length = gen_config_params.get("max_length", 256)
class CustomToolInput(BaseModel):
income: float = Field()
class CustomTool(BaseTool):
name = "tax_calculator"
description = "Use this tool when you need to calculate the tax using the income"
args_schema: Type[BaseModel] = CustomToolInput
def _run(self, income: float):
return float(income) * math.pi / 100
# Define a function to generate llm_config from a LangChain tool
def generate_llm_config(tool):
# Define the function schema based on the tool's args_schema
function_schema = {
"name": tool.name.lower().replace(" ", "_"),
"description": tool.description,
"parameters": {
"type": "object",
"properties": {},
"required": [],
},
}
if tool.args is not None:
function_schema["parameters"]["properties"] = tool.args
return function_schema
custom_tool = CustomTool()
config_list_custom = autogen.config_list_from_json(
env_or_file = "OAI_CONFIG_LIST.json",
file_location = "/dev/",
filter_dict = {"model_client_cls": ["CustomModelClientWithArguments"]},
)
config = config_list_custom[0]
loaded_model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path = config["model"])
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path = config["model"], use_fast = False)
user_proxy = UserProxyAgent(
name = "user_proxy",
is_termination_msg = lambda x: x.get("content", "") and x.get("content", "").rstrip().endswith("TERMINATE"),
human_input_mode = "NEVER",
max_consecutive_auto_reply = 2,
code_execution_config = {
"work_dir": "coding",
"use_docker": False, # Please set use_docker=True if docker is available to run the generated code. Using docker is safer than running the generated code directly.
"timeout": 600,
"last_n_messages": 1
},
)
user_proxy.register_function(
function_map={
custom_tool.name: custom_tool._run
}
)
llm_config = {
"functions": [generate_llm_config(custom_tool)],
"config_list": config_list_custom,
"timeout": 120
}
assistant = AssistantAgent(
name = "assistant",
llm_config = llm_config,
system_message = "For coding tasks, only use the functions you have been provided with. Reply TERMINATE when the task is done."
)
assistant.register_model_client(
model_client_cls = CustomModelClientWithArguments,
loaded_model = loaded_model,
tokenizer = tokenizer
)
with autogen.Cache.disk():
user_proxy.initiate_chat(assistant, message="when the income is 100, calculate the tax")
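As a side note on the tool-calling failure: since Mistral-7B-OpenOrca has no native tool-call format, one hedged workaround is to surface the function schemas to the model as plain text inside create(). The sketch below builds on the CustomModelClientWithArguments class above; the subclass name is made up, and it assumes the "functions" list from llm_config is forwarded to the client's create() params, which may depend on the AutoGen version. It does not produce structured tool_calls; it only lets the model see the tool definitions.
import json

class CustomModelClientWithToolPrompt(CustomModelClientWithArguments):
    # Sketch only: prepend the function schemas (if present) to the message list
    # as an extra system message so a model without native tool-call support at
    # least sees the tool definitions as text. Adjust the role if the model's
    # chat template does not accept a system message.
    def create(self, params):
        functions = params.get("functions") or []
        if functions:
            tool_text = "You may use the following tools:\n" + json.dumps(functions, indent=2)
            params = dict(params)
            params["messages"] = [{"role": "system", "content": tool_text}] + list(params["messages"])
        return super().create(params)
If you try something like this, it would be registered the same way as CustomModelClientWithArguments, and the "model_client_cls" value in OAI_CONFIG_LIST.json would need to match the new class name.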
I posted this issue to our Discord channel to see if there are some there that can help. https://discord.com/channels/1153072414184452236/1201369716057440287
Thank you! But I see nothing after clicking the link. I am new to Discord; am I missing anything?
You might need a fine-tuned model. Trelis on Hugging Face has a couple, but there is also a dataset you can use if you want to train your own. As for the AutoGen Discord server, I think you have to look for the channel #alt-models.
@woodswift Mistral models have recently started to support tool calls. Have you checked? https://docs.mistral.ai/api/#operation/createChatCompletion
Oh, thank you for sharing! I have not tried it yet, so I will do so shortly :)
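For reference, a minimal sketch (not tested here) of pointing AutoGen at the hosted Mistral API, which exposes an OpenAI-compatible chat completions endpoint; the model name is illustrative, so check Mistral's docs for which models actually support function calling:
# Sketch: an OAI_CONFIG_LIST-style entry for the hosted Mistral API.
config_list_mistral = [
    {
        "model": "mistral-large-latest",          # illustrative; pick a model that supports tool calls
        "api_key": "<your Mistral API key here>",
        "base_url": "https://api.mistral.ai/v1",
    }
]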
@woodswift, you should be able to do it through LiteLLM + Ollama (note: Ollama released a new version, 0.1.29; you'll need that). You can also test through together.ai, who have Mistral and Mixtral models that support function calling.
Oh, if you are using LiteLLM + Ollama, please be sure to use "ollama_chat/" rather than "ollama/".
Does that work?
@woodswift can you update us?
Hi, I use Ollama with Mistral, but still can't use function calling :( And I don't understand what "if you are using LiteLLM + Ollama, please be sure to use 'ollama_chat/' rather than 'ollama/'" means. Can you explain that more?
Hi @JarkimZhu, no problem. When you run your LiteLLM server, you need to use "ollama_chat" instead of "ollama". Here's an example:
litellm --model ollama_chat/llama2
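Once the proxy is up, a minimal sketch (assuming it listens on localhost:4000; use whatever host/port LiteLLM prints on startup) of pointing AutoGen at it as an OpenAI-compatible endpoint:
import autogen

# Sketch: AutoGen config entry for a local LiteLLM proxy serving an Ollama model.
config_list_litellm = [
    {
        "model": "ollama_chat/llama2",
        "api_key": "NULL",                    # placeholder; the local proxy does not check it
        "base_url": "http://localhost:4000",  # adjust to the port LiteLLM reports
    }
]

assistant = autogen.AssistantAgent(name="assistant", llm_config={"config_list": config_list_litellm})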
- I created several functions with the custom model "Mixtral 8x7B", and I can see them in assistant.llm_config['tools'] and user_proxy.function_map, but I didn't see a log like '***** Suggested tool Call'.
- After examining the source code, I'm still unsure where the "tool_calls" key is added to message["tool_calls"]. Can someone help, or offer some suggestions?
- Does "Mixtral 8x7B" support tool calls?
Thank you!
Can you tell us how you are running the model? LiteLLM + Ollama, together.ai, etc.
If it's LiteLLM, can you please share the command line?
And any sample code you are using would help.
Thanks!
The solution is discussed in discussion #3196.