ml-commons
[META] Agent framework enhancement
A new type of agent: plan-and-solve
The idea is inspired by the papers Plan-and-Solve and LLMCompiler. Essentially, the LLM first plans all steps as a DAG, then the agent executes the DAG steps in parallel. Based on the execution results, the agent either returns the result or asks the LLM to replan.
Compared with the current conversational agent (ReAct), the plan-and-solve agent we will introduce has the following advantages:
- Lower latency - the agent does not need to call the LLM for every step, only for the (re)planning steps.
- Cost savings - fewer LLM invocations.
- Better accuracy - the LLM is pushed to optimize the plan for the entire task rather than greedily for one step at a time.
- Parallel tool execution - planning all steps as a DAG makes it possible to execute independent tools in parallel.
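The execute-in-parallel idea above can be sketched as a small DAG runner. This is a minimal illustration, not ml-commons code: the plan format (step id mapped to a tool function plus its dependency ids) is an assumption standing in for whatever schema the planning LLM would produce.

```python
from concurrent.futures import ThreadPoolExecutor

def run_plan(plan):
    """Execute a planned DAG, running steps with satisfied dependencies in parallel.

    plan: dict mapping step_id -> (tool_function, list_of_dependency_step_ids).
    Returns a dict of step_id -> result.
    """
    results = {}
    remaining = dict(plan)
    with ThreadPoolExecutor() as pool:
        while remaining:
            # All steps whose dependencies are already computed can run now, concurrently.
            ready = [sid for sid, (_, deps) in remaining.items()
                     if all(d in results for d in deps)]
            if not ready:
                raise ValueError("cycle detected in plan")
            futures = {sid: pool.submit(remaining[sid][0],
                                        *(results[d] for d in remaining[sid][1]))
                       for sid in ready}
            for sid, fut in futures.items():
                results[sid] = fut.result()
                del remaining[sid]
    return results

# Hypothetical plan from a planning LLM: steps "a" and "b" are independent
# tool calls that run in parallel; "c" combines their outputs.
plan = {
    "a": (lambda: 2, []),
    "b": (lambda: 3, []),
    "c": (lambda x, y: x + y, ["a", "b"]),
}
```

With this shape, the agent only goes back to the LLM if a step fails or the final result is unsatisfactory, which is the replanning path described above.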
Input/Output parser
We will introduce a generic input/output parser interface to handle the different input/output schemas of different LLMs.
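One way such an interface could look is sketched below. The class and method names (`LLMIOParser`, `parse_input`, `parse_output`) and the chat-completion schema are assumptions for illustration, not the actual ml-commons API.

```python
from abc import ABC, abstractmethod

class LLMIOParser(ABC):
    """Hypothetical interface: adapt agent prompts/responses to a model-specific schema."""

    @abstractmethod
    def parse_input(self, prompt: str) -> dict:
        """Convert a plain prompt into the model's request payload."""

    @abstractmethod
    def parse_output(self, response: dict) -> str:
        """Extract the answer text from the model's response payload."""

class ChatCompletionParser(LLMIOParser):
    """Example implementation for a chat-completion style API (assumed schema)."""

    def parse_input(self, prompt):
        return {"messages": [{"role": "user", "content": prompt}]}

    def parse_output(self, response):
        return response["choices"][0]["message"]["content"]
```

A parser per model family keeps the agent logic schema-agnostic: the agent only ever sees plain strings, and swapping LLMs means swapping the parser.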
Streaming
Streaming responses can help applications provide a better customer experience.
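The benefit is that partial output reaches the user while the LLM is still generating. A minimal generator-based sketch (the function names are illustrative, not ml-commons APIs):

```python
def stream_completion(chunks):
    """Simulate a streaming LLM response: yield partial text as it arrives,
    instead of returning the whole answer at the end."""
    for chunk in chunks:
        yield chunk

def render_incrementally(stream):
    """Consume the stream piece by piece; in a real application each piece
    would be flushed to the client immediately."""
    parts = []
    for piece in stream:
        parts.append(piece)
    return "".join(parts)
```

The same pattern applies whether chunks come from a server-sent-events connection or a model provider's streaming API.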
Async agent execution
Agent execution is a time-consuming invocation. Supporting asynchronous execution provides more flexibility for building applications.
Caching
Calling an LLM is expensive in terms of both latency and cost. In some cases, we could cache LLM responses and reuse them.
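A simple form of this is an exact-match cache keyed by a hash of the prompt, so repeated identical calls skip the LLM round trip. The class below is an illustrative sketch (the names and keying strategy are assumptions; a production cache would also need eviction and TTLs):

```python
import hashlib

class LLMCache:
    """Hypothetical cache in front of an LLM call, keyed by prompt hash."""

    def __init__(self, llm_call):
        self.llm_call = llm_call  # the expensive function we want to avoid repeating
        self.store = {}
        self.hits = 0

    def invoke(self, prompt):
        key = hashlib.sha256(prompt.encode()).hexdigest()
        if key in self.store:
            self.hits += 1
            return self.store[key]
        response = self.llm_call(prompt)  # LLM is only called on a cache miss
        self.store[key] = response
        return response
```

Exact-match caching only helps when prompts repeat verbatim; semantic caching (matching on embedding similarity) is a natural extension but trades correctness risk for a higher hit rate.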