
Investigate dynamic tool selection/filtering

michael-desmond opened this issue 1 year ago • 12 comments

Is your feature request related to a problem? Please describe. For a particular user message not all of the available tools may be necessary to produce a response. The inclusion of the entire tool set introduces superfluous tokens in the LLM input.

Describe the solution you'd like A way for the agent to select a subset of tools in response to a given user message during agent execution.

Describe alternatives you've considered This touches on decomposition of agent functions. An alternative would be a way to compose an agent from a set of more primitive elements with predefined connections.

michael-desmond avatar Nov 19 '24 18:11 michael-desmond

@michael-desmond, not related to building the capability into the agent itself, but I did something similar for a different reason in https://developers.redhat.com/blog/2024/10/25/building-agents-large-language-modelsllms-and-nodejs

mhdawson avatar Dec 02 '24 16:12 mhdawson

@michael-desmond Could you comment a bit more on the alternative solution and the decomposition of agents? Is this motivated by the open-canvas?

matoushavlena avatar Dec 04 '24 14:12 matoushavlena

We could extend the emitted data in the start event, as shown in #224.

This would let anybody do something like this.

```ts
const agent = new BeeAgent(...);

agent.emitter.on("start", async ({ memory, tools, meta }) => {
  const lastMessage = memory.messages.at(-1); // a better check should be used

  if (lastMessage?.text.includes("weather")) {
    const newTools = tools.filter((tool) => tool.name.includes("weather"));
    tools.splice(0, Infinity, ...newTools); // mutate in place so the agent sees the filtered list
  }
});
```

Tomas2D avatar Dec 04 '24 17:12 Tomas2D

@matoushavlena Yes it’s somewhat inspired by openCanvas, and generally by the langGraph approach to handling complexity. Right now a single agent (sys prompt) is responsible for selecting the tool, calling the tool, and producing a final answer. Conceivably this process could be handled by a set of nodes/components each with a much narrower scope that work together to (potentially) produce a more robust overall agent i.e. tool selection (look at dialog history and choose a tool), tool calling (call the given tool), response generation etc.

michael-desmond avatar Dec 04 '24 17:12 michael-desmond

The Accelerated Discovery team uses a standard RAG pattern over tool names and descriptions to filter the list of "equipped tools" down to the 10 most relevant, based on the MMR algorithm.

Here is a publicly available document from LangChain that matches the pattern they used: How to handle large numbers of tools. Courtesy of @prattyushmangal.
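For illustration, here is a minimal sketch of what MMR-based tool pre-filtering could look like. Everything here (`ToolInfo`, `embed`, `cosine`, `mmrSelect`) is hypothetical and not part of the framework; a real implementation would use proper vector embeddings rather than the toy bag-of-words used below.

```typescript
interface ToolInfo {
  name: string;
  description: string;
}

// Toy bag-of-words embedding, purely for illustration.
function embed(text: string): Map<string, number> {
  const vec = new Map<string, number>();
  for (const word of text.toLowerCase().match(/[a-z]+/g) ?? []) {
    vec.set(word, (vec.get(word) ?? 0) + 1);
  }
  return vec;
}

function cosine(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0;
  for (const [word, count] of a) dot += count * (b.get(word) ?? 0);
  const norm = (v: Map<string, number>) =>
    Math.sqrt([...v.values()].reduce((s, x) => s + x * x, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// MMR: greedily pick up to k tools, trading off relevance to the query
// against redundancy with tools already selected (lambda = 1 is pure relevance).
function mmrSelect(query: string, tools: ToolInfo[], k: number, lambda = 0.7): ToolInfo[] {
  const queryVec = embed(query);
  const vecs = tools.map((t) => embed(`${t.name} ${t.description}`));
  const selected: number[] = [];
  const candidates = new Set(tools.map((_, i) => i));

  while (selected.length < k && candidates.size > 0) {
    let best = -1;
    let bestScore = -Infinity;
    for (const i of candidates) {
      const relevance = cosine(queryVec, vecs[i]);
      const redundancy = Math.max(0, ...selected.map((j) => cosine(vecs[i], vecs[j])));
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        best = i;
      }
    }
    selected.push(best);
    candidates.delete(best);
  }
  return selected.map((i) => tools[i]);
}
```

The selected subset can then be spliced into the agent's tool list (e.g. inside the `start` event handler shown above) before the LLM ever sees it.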

geneknit avatar Dec 04 '24 19:12 geneknit

RAG with MMR can work well, but from my point of view, it is useful only if you have many tools and want to do pre-filtering to speed up the decision time.

Here is an example of how this can be done with structured generation within the framework:

```ts
const prompt = "What is the current weather in San Francisco?";
const maxTools = 1;

const llm = new OllamaChatLLM();
const driver = new JsonDriver(llm);
const tools = [new GoogleSearchTool(), new WikipediaTool(), new OpenMeteoTool()] as const;

const response = await driver.generate(
  z.object({
    tools: z.array(z.enum(tools.map((tool) => tool.name) as [string, ...string[]])).max(maxTools),
  }),
  [
    BaseMessage.of({
      role: Role.USER,
      text: `# Tools
${tools.map((tool) => `Tool Name: ${tool.name}\nTool Description: ${tool.description}`).join("\n")}

# Objective
Give me a list of the most relevant tools to answer the following prompt.
Prompt: ${prompt}`,
    }),
  ],
);

console.info(response.tools); // ["OpenMeteo"]
```

Tomas2D avatar Dec 05 '24 09:12 Tomas2D

Hi @Tomas2D, your remark on speeding up decision time is right, but our main reasoning for tool filtering was accuracy.

If you show the LLM a filtered list instead of all available tools when it makes a decision, it may be less likely to hallucinate and more accurate in its tool selection.

So benefits of pre-filtering = accuracy ⬆️ and inference time ⬇️

prattyushmangal avatar Dec 05 '24 09:12 prattyushmangal

I see. What if you pick the wrong subset of tools, and when is the tool selection done?

Because Bee Agent is an extended ReAct agent, I see two approaches.

  1. We pre-filter tools at the very beginning (faster, but the agent may end up unable to respond).
  2. We pre-filter tools before every iteration (slower, but leads to better results).
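To make the difference concrete, here is a hedged sketch of option 2: a simplified ReAct-style loop that re-filters the tool list before every iteration. The `Tool` interface, `filterTools`, the naive keyword matching, and the `FINAL:` convention are all illustrative stand-ins, not the framework's actual API.

```typescript
interface Tool {
  name: string;
  run: (input: string) => string;
}

// Stand-in for RAG/MMR or LLM-based selection: naive keyword matching,
// with a fallback to the full tool set so the agent is never left empty-handed.
function filterTools(tools: Tool[], message: string): Tool[] {
  const relevant = tools.filter((t) => message.toLowerCase().includes(t.name.toLowerCase()));
  return relevant.length > 0 ? relevant : tools;
}

function runAgent(allTools: Tool[], userMessage: string, maxIterations = 3): string {
  let message = userMessage;
  for (let i = 0; i < maxIterations; i++) {
    // Re-filter before each iteration, so later steps can draw on a
    // different subset than the one chosen at the start of the run.
    const tools = filterTools(allTools, message);
    const tool = tools[0]; // stand-in for the LLM picking one of the filtered tools
    const observation = tool.run(message);
    if (observation.startsWith("FINAL:")) return observation.slice("FINAL:".length).trim();
    message = observation; // feed the observation into the next iteration
  }
  return message;
}
```

Option 1 would hoist the `filterTools` call out of the loop, trading per-iteration latency for the risk that a later step needs a tool that was filtered out up front.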

In addition, the agent's system prompt, which contains all the tool-related information, sits at the beginning of the conversation history, followed by the conversation messages. I am worried that this tool filtering could confuse the agent, because the agent might see old messages containing calls to tools that are not available in the current interaction.

Tomas2D avatar Dec 05 '24 12:12 Tomas2D

> I see. What if you pick the wrong subset of tools, and when is the tool selection done?

Yep, there is no way to mitigate this completely, but we can reduce the likelihood by returning a sufficiently high number of filtered tools.

> In addition, the Agent's system prompt, which contains all the tool-related information, sits at the beginning of the conversation history, followed by conversation messages. I am worried that this tool's filtering could confuse the agent because the agent might see old messages requiring tool calls to tools that are not available in the current interaction.

On this concern, I think we would have to assess different system prompt configurations and the behaviours exhibited by the LLMs. In my LangChain-based implementation, the tools are only present in the LLM prompts when the agent is performing a "tool selection" activity; every LLM prompt contains only "need to know" information. I know Bee is currently a single-prompt agent, but in the future what I suggest might be better suited to a more multi-agent approach.

prattyushmangal avatar Dec 05 '24 13:12 prattyushmangal

In other words, you don't preserve intermediate steps between iterations, which has its own pros and cons.

Tomas2D avatar Dec 05 '24 14:12 Tomas2D

This is a fluid area, and we should not harden our stance yet. I think changing tool selection at different steps is a fine approach to try.

dakshiagrawal avatar Dec 06 '24 19:12 dakshiagrawal

Would extending the events emitted by the agent be helpful for further exploration? As depicted in https://github.com/i-am-bee/bee-agent-framework/pull/224.

Tomas2D avatar Dec 11 '24 13:12 Tomas2D