Refactorization of ConversableAgent to unify async and sync code and better extensibility
Note: this is a refactorization study and will change a lot as we explore different design patterns and how they fit together in the framework. Feedback and suggestions are more than welcome.
Why are these changes needed?
The current state.
Several issues and PRs involve how to extend the ConversableAgent class. Efforts like Teachability for every agent (#534) and the concept of modularized agent capabilities are a big step toward solving this at a higher level. There are still low-level extension issues, such as:
- Logging #1146
- Streaming message to frontend #394 #1290 #1313
- Several bugs caused by async and sync code getting out of sync during code changes, like #1242, and issues related to unexpected behaviors of async public methods (#1012)
Some patches have addressed those issues, but not the root cause: the ConversableAgent was mostly designed to work in a console environment rather than as a server-side library, and its async and sync public methods duplicate code.
What is this PR for.
The main goal of this PR is to address the above and, at the same time, introduce a design pattern that makes it much easier to add low-level functionality like logging, content filtering, RAG-style context expansion, custom termination mechanisms, and many others. We also want to demonstrate that high-level capabilities like Teachability can be composed of reusable low-level components.
What is this PR NOT for:
This PR is not for breaking the existing ConversableAgent methods. The ConversableAgent has many great methods, like register_for_llm and initiate_chat, that are loved by users.
What can you do to help:
This is not a small change, so we want as much feedback AND HELP as possible. We will post some work items on this page as we move forward, and you are welcome to contribute!
Update 01/16/2024
@ekzhu and I came up with a new scheme that introduces a single new function/decorator, hookable, and implements the middleware pattern. An example of how to use it is as follows:
```python
class A:
    def __init__(self, name: str) -> None:
        self.name = name

    @hookable
    def go(self, *args: Any, **kwargs: Any) -> str:
        return f"{self.name}.{format_function(self.go, *args, **kwargs)}"


class MyMiddleware:
    def __init__(self, name: str) -> None:
        self.name = name

    def call(self, *args: Any, next: Callable[..., Any], **kwargs: Any) -> str:
        retval = next(*args, **kwargs)
        return f"{self.name}.{format_function(self.call, retval)}"

    def trigger(self, *args: Any, **kwargs: Any) -> bool:
        return not ("skip_middleware" in kwargs and kwargs["skip_middleware"])


a = A("a")

add_middleware(A.go, MyMiddleware("mw"))
assert a.go(1, 2, 3, a=4, b=5) == "mw.call(a.go(1, 2, 3, a=4, b=5))"
assert a.go(1, 2, 3, a=4, b=5, skip_middleware=False) == "mw.call(a.go(1, 2, 3, a=4, b=5, skip_middleware=False))"
assert a.go(1, 2, 3, a=4, b=5, skip_middleware=True) == "a.go(1, 2, 3, a=4, b=5, skip_middleware=True)"

add_middleware(A.go, MyMiddleware("MW"))
assert a.go(1, 2, 3, a=4, b=5) == "mw.call(MW.call(a.go(1, 2, 3, a=4, b=5)))"
```
There can be more than one hookable method in each class. We can use this to implement reply and hook functions and probably many other things.
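To make the mechanics above concrete, here is a minimal self-contained sketch of how `hookable` and `add_middleware` could be wired together. This is not the PR's actual implementation: `format_function` is omitted, the middleware interface is simplified to a single positional argument, and the chain is stored at class level as in the example above.

```python
import functools
from typing import Any, Callable, List


def hookable(func: Callable[..., Any]) -> Callable[..., Any]:
    """Mark a method so middleware can be attached to it (class-level sketch)."""
    middlewares: List[Any] = []

    @functools.wraps(func)
    def wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
        # The innermost "next" is the original method itself.
        call_next: Callable[..., Any] = lambda *a, **kw: func(self, *a, **kw)
        # Wrap so the first-registered middleware ends up outermost;
        # a middleware whose trigger() returns False is skipped.
        for mw in reversed(middlewares):
            if not hasattr(mw, "trigger") or mw.trigger(*args, **kwargs):
                call_next = functools.partial(mw.call, next=call_next)
        return call_next(*args, **kwargs)

    wrapper._middlewares = middlewares  # type: ignore[attr-defined]
    return wrapper


def add_middleware(hooked: Callable[..., Any], mw: Any) -> None:
    """Attach a middleware object to a hookable method."""
    hooked._middlewares.append(mw)  # type: ignore[attr-defined]


# Usage mirroring the example above (names are illustrative):
class A:
    def __init__(self, name: str) -> None:
        self.name = name

    @hookable
    def go(self, x: int) -> str:
        return f"{self.name}.go({x})"


class Upper:
    def __init__(self, name: str) -> None:
        self.name = name

    def call(self, x: int, next: Callable[..., Any]) -> str:
        return f"{self.name}({next(x)})"

    def trigger(self, x: int) -> bool:
        return x != 0  # skip this middleware when x == 0
```

Note that `functools.partial` captures the current `call_next` at wrap time, which avoids the usual late-binding pitfall of closures created inside a loop.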
Update 01/17/2024
@tyler-suard-parker @joshkyh @bitnom @jackgerrits @rickyloynd-microsoft
You are welcome to try out this branch. We are currently working on replacing register_hook; then we will move on to refactoring the existing generate_***_reply functions in the ConversableAgent class into middleware and upgrading the generate_reply method to use it -- so the current functionality stays the same.
But we need someone to think about how to implement some of these new features using middleware, and add them to the generate_reply to enable new functionalities.
- Incoming and outgoing message streaming to web socket
- Incoming and outgoing message logging
- RAG-style message context expansion, e.g., retrieve relevant context from a vector database and expand the incoming message's content.
- Human middleware to short-circuit the rest of the middleware pipeline -- think a static file middleware in a web framework.
- Outgoing message filtering, e.g., filtering API keys, passwords, etc.
Here is a simple example of logging middleware that logs incoming and outgoing messages.
```python
class LoggingMiddleware:
    def call(self, message: Dict, next: Callable[[Dict], Dict]) -> Dict:
        logging.info(f"Incoming: {message}")
        retval = next(message)
        logging.info(f"Outgoing: {retval}")
        return retval
```
Here is another for filtering out OpenAI API keys.
```python
class FilterAPIKeyMiddleware:
    def call(self, message: Dict, next: Callable[[Dict], Dict]) -> Dict:
        retval = next(message)
        if retval.get("content", False):
            retval["content"] = re.sub(r"(sk-\w{4})\w+", r"\1***", retval["content"])
        return retval
```
Another one for simple RAG-style context expansion:
```python
class RAGMiddleware:
    def call(self, message: Dict, next: Callable[[Dict], Dict]) -> Dict:
        if message.get("content", False):
            # Expand the message content with some text retrieved from a vector db.
            expansion = vectordb.search(message["content"], k=1)[0]
            message["content"] += expansion
        retval = next(message)
        return retval
```
We are also adding a decorator that converts a function into a middleware, saving users the effort of writing a class.
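As a rough illustration of what such a decorator could look like (a sketch only -- `middleware_from_function` and `shout` are hypothetical names, not the API being added in the PR):

```python
from typing import Callable, Dict


def middleware_from_function(func: Callable) -> object:
    """Wrap a plain function of the form f(message, next) into an object
    exposing the `call` interface a middleware is expected to have.
    Hypothetical helper; the PR's actual decorator may differ."""
    class _FunctionMiddleware:
        def call(self, message: Dict, next: Callable[[Dict], Dict]) -> Dict:
            return func(message, next)

    return _FunctionMiddleware()


# A one-off middleware written as a function: upper-cases outgoing content.
@middleware_from_function
def shout(message: Dict, next: Callable[[Dict], Dict]) -> Dict:
    reply = next(message)
    if reply.get("content"):
        reply["content"] = reply["content"].upper()
    return reply
```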
More updates 01/17/2024
Teachability is refactored using the Middleware pattern instead of hooks. This is the actual implementation right now:
```python
def add_to_agent(self, agent: ConversableAgent):
    """Adds teachability to the given agent."""
    self.teachable_agent = agent

    # Register a middleware for processing the last message.
    class ProcessLastMessageMiddleware:
        def __init__(self, *, agent: ConversableAgent, teachability: Teachability):
            self.teachability = teachability
            self.agent = agent

        def call(self, agent: ConversableAgent, user_text: str, *, next: Callable[[str], str]):
            user_text = next(agent, user_text)
            return self.teachability.process_last_message(user_text)

        def trigger(self, agent: ConversableAgent, user_text: str):
            return self.agent == agent

    add_middleware(
        ConversableAgent.process_last_message_user_text,
        ProcessLastMessageMiddleware(agent=agent, teachability=self),
    )
```
Whenever ConversableAgent.process_last_message_user_text is called, ProcessLastMessageMiddleware.call is invoked, and a wrapper around the original ConversableAgent.process_last_message_user_text is passed as the next parameter. All type hints here are optional; they are present only to help explain the expected parameters.
There is some cleanup and error handling remaining, but this is basically it. As Erik mentioned above, it is easy to write a set of standard Middleware classes that cover the most common use cases.
Related issue number
Checks
- [x] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
- [x] I've added tests (if relevant) corresponding to the changes introduced in this PR.
- [x] I've made sure all auto checks have passed.
Codecov Report
Attention: 209 lines in your changes are missing coverage. Please review.
Comparison is base (1ab2354) 32.48% compared to head (bbdc8dd) 49.48%.
```diff
@@             Coverage Diff             @@
##             main    #1240       +/-   ##
===========================================
+ Coverage   32.48%   49.48%   +17.00%
===========================================
  Files          41       49        +8
  Lines        4907     5252      +345
  Branches     1120     1238      +118
===========================================
+ Hits         1594     2599     +1005
+ Misses       3187     2485      -702
- Partials      126      168       +42
```
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 49.42% <78.25%> (+16.98%) :arrow_up: | |
This starts to resemble more and more the middleware pattern: https://learn.microsoft.com/en-us/aspnet/core/fundamentals/middleware/?view=aspnetcore-8.0
See the diagram above. Each request gets processed by a chained pipeline of middlewares. Each middleware does some work based on the message and current application state, and decides whether to pass control to the next one or "short-circuit" back to the previous one.
We can also use this design pattern to model our ConversableAgent, and I think it accurately describes how our agent currently behaves.
A simple LLM-based agent contains the following middlewares in order:
- Termination handling (counter-based termination, message-based termination, and others) -- this is like a short-circuiting middleware that returns before anything else runs.
- Message manipulation, e.g., for an OpenAI message, we need to change the "role" key to "user" to fake a user message.
- Message augmentation, e.g., augmenting each message with relevant context via RAG-style retrieval. Teachability and memory can be considered this way.
- Bookkeeping: storing and logging incoming and outgoing messages.
- Each generate_reply method is a middleware, as each of them gets triggered and decides whether to pass control to the next one. If it decides to short-circuit, the remaining reply methods are skipped.
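The chained pipeline described above can be sketched in a few lines. This is illustrative only -- the function and middleware names are made up, not the agent's actual code -- but it shows how a termination middleware short-circuits before the LLM step runs:

```python
from typing import Callable, Dict, List, Optional

Reply = Optional[str]


def run_pipeline(message: Dict, middlewares: List[Callable]) -> Reply:
    """Run a message through a chained middleware pipeline. Each middleware
    receives the message and a `next` callable, and may short-circuit by
    returning without calling `next`."""
    def make_next(i: int) -> Callable[[Dict], Reply]:
        def _next(msg: Dict) -> Reply:
            if i == len(middlewares):
                return None  # end of pipeline, no reply produced
            return middlewares[i](msg, make_next(i + 1))
        return _next

    return make_next(0)(message)


def termination(msg: Dict, next: Callable[[Dict], Reply]) -> Reply:
    # Short-circuits before anything else runs when asked to exit.
    if msg.get("content") == "exit":
        return "TERMINATE"
    return next(msg)


def llm_reply(msg: Dict, next: Callable[[Dict], Reply]) -> Reply:
    # Stand-in for an LLM-based reply generator at the end of the chain.
    return f"echo: {msg.get('content', '')}"
```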
@ekzhu exactly! Hooks seemed like the easiest thing to try out because they are used only in Teachability, but we should apply the same principle to everything else as you described above.
Maybe we can use this branch to try this design on Teachability and perhaps a few other things, like logging and content filtering. This would supersede #1215 and #1146 (@cheng-tan), so let's take the time to do it right.
@ekzhu exactly, this should be a part of making the API consistent across the board. Hence the draft status, otherwise this is good to go.
This might be the best pr of the repo rn. It's a better direction than the simple observers I've been using. It will be exciting to hook all methods.
Expanding on this draft would greatly benefit from a comprehensive refactor and utilization of pydantic to its full potential. I'll push whatever I can think of for that in my next work-session, and probably have a few methods worth adding to Hookable. There's a lot of potential there.
I've started down this path now. Here are some preliminary references in case anyone is interested in existing implementations:
https://pypi.org/project/middleware/
https://pypi.org/project/observer-hooks/
https://pypi.org/project/func-hooks/
https://pypi.org/project/better-hooks no repo so I pushed to: https://github.com/bitnom/better-hooks
then I got the AI involved:
- Wrapt: a module for decorators, wrappers, and monkey patching, offering control over the execution process and suitable for implementing pre-hooks and post-hooks. Website: https://wrapt.readthedocs.io
- Pluggy: the plugin management system used by pytest, allowing the definition of hook specifications and the implementation of plugins that can extend or modify the behavior of a host program. Website: https://pluggy.readthedocs.io
- Aspectlib: an aspect-oriented programming library for Python, which helps in separating concerns in a program, similar to middleware.
- Events: a lightweight event library for adding hooks (event handlers) to Python code, implementing the observer pattern. GitHub: https://github.com/pyeve/events
- Blinker: provides support for Signals and Slots, a callback mechanism that can be used in various applications.
Edit: fixed 2 of the links. I've not used any of these modules before, but having skimmed through their docs:
wrapt, pluggy, aspectlib, and blinker all look awesome. "Events" sounds not so great. I'm not suggesting we need to use an existing lib but it should be seriously considered. I'm sure a lot of problems have been solved in them. Some of these seem quite battle-tested.
With this, we could add @hookable_method to https://github.com/microsoft/autogen/blob/main/autogen/oai/client.py#L416 to hook into what's coming to and from the LLM, right?...
ref: https://github.com/microsoft/autogen/pull/1146
for sure. We could actually decorate literally every class/func/method in autogen to support the hooks system, which is what I think should be done.
note: This PR feels like it could take a while to be feature-complete. There is already a simpler PR ahead of this one which allows a custom openai wrapper: https://github.com/microsoft/autogen/pull/1217
Yes, I saw that other PR too. IMHO, I like this generic hook approach, as it lets you decouple logic from the base code.
Maybe one could even set ENABLE_HOOKS="logging,other_ones", with basic built-in hooks already included, so users can just enable common things like flags.
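To make that flag idea concrete, here is a hypothetical sketch (none of these names exist in autogen; the comma-separated string could come from an environment variable like the suggested ENABLE_HOOKS):

```python
from typing import Dict, List

# Hypothetical registry of built-in hooks, keyed by short names.
BUILTIN_HOOKS: Dict[str, object] = {}


def register_builtin(name: str, hook: object) -> None:
    """Register a built-in hook under a short name."""
    BUILTIN_HOOKS[name] = hook


def enabled_hooks(flag: str = "") -> List[object]:
    """Parse a comma-separated flag like "logging,other_ones" and return
    the matching built-in hook objects, ignoring unknown names."""
    names = [n.strip() for n in flag.split(",") if n.strip()]
    return [BUILTIN_HOOKS[n] for n in names if n in BUILTIN_HOOKS]
```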
Yes, i saw that other PR too. I like, IMHO, this generic hook aproach as lets you decouple logic from the base-code.
Maybe, one would even ENABLE_HOOKS="logging,other_ones", with already basic built-in hooks so users can just enable them like flags for common things.
really now I feel like subclassing and using one of those modules I listed on all the things.
I don't come from a strong Python background, so I don't want to overly stress my opinion here. Still, as I read this PR's conversations and hang around the relevant existing code in the repo, I feel an "override anything you want" approach would create a problem in the other direction.
Here is how I see it: at the moment, even replacing a simple reply takes a lot of work, but at least it's clear which parts of an agent are customizable -- or, in other words, when an "agent" is no longer an agent.
Basically, I love this PR's direction of using an established method for extending and mixing in new behavior. I'm also all for deduplicating the async-specific code as much as possible. But there should always be a clear vision of the ways and points at which this can be done. Otherwise, maintaining a strongly backward-compatible API (which lets the user base keep upgrading without forking or abandoning) will be almost impossible. The same goes for documentation: showing 100 possible ways of doing the same thing would confuse new users and reduce the adoption rate.
I would suggest:
- Recognize from known use cases in the wild how users WANT to extend and customize the library today (not in a distant future).
- Design this extensibility looking into a future of endless variations on the above use cases (not other use cases).
- Look for an established Pythonic way this should be done.
- Hopefully find a backward-compatible way of weaving that in without breaking the existing API.
Hope this helps; thank you so much for your work!
Thank you for the valuable feedback! We need to recognize the fact that most autogen users who want to extend agent capabilities will not have strong python backgrounds. This should not prevent us from adopting clean and powerful extension mechanisms, but the machinery needs to be readable, carefully documented, and easy for almost anyone to use, as well as easy to maintain after all of us have moved on. If we require users to become versed in arcane libraries and abstruse programming patterns, those barriers will severely limit autogen's adoption.
I added support for async functions and for conditions. Now it is easy to create a Middleware class that supports the functionality needed to replace hooks and reply functions with triggers.
To ground our discussion of refactorization, here's a summary of the hook-based process (introduced by #1091) that was used to refactor teachability into a composable, chainable capability for addition to any agent.
This hook process involves three mechanisms: Capability addition, Hook registration and Hook execution:
| Mechanism | Agent objects and methods | Capability objects and methods |
|---|---|---|
| Capability addition extends an agent. | The app instantiates an agent, like `agent = ConversableAgent()` or one of its subclasses. | The app instantiates a capability, like `teachability = Teachability()`, then calls `teachability.add_to_agent(agent)`. |
| Hook registration connects a hook method to a hookable method. | `ConversableAgent` implements and calls hookable methods, like `ConversableAgent.process_last_message()`. `ConversableAgent` implements `ConversableAgent.register_hook(hookable_method, hook)`. | The capability implements hook methods, like `Teachability.process_last_message()`. `Teachability.add_to_agent(agent)` calls `agent.register_hook(hookable_method, hook)`. |
| Hook execution | `ConversableAgent` methods call its hookable methods, which call their registered hook methods (if any). | Capability hook methods are called by their registered hookable methods. |
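The three mechanisms in the table can be sketched in miniature. This is a toy model of the flow, not autogen's actual classes (the agent class, hook-list layout, and method names below are illustrative):

```python
from typing import Callable, Dict, List


class HookableAgent:
    """Minimal sketch of the hook mechanism described in the table:
    hookable methods look up and apply their registered hooks."""

    def __init__(self) -> None:
        # One hook list per hookable method name.
        self.hook_lists: Dict[str, List[Callable[[str], str]]] = {
            "process_last_message": []
        }

    def register_hook(self, hookable_method: str, hook: Callable[[str], str]) -> None:
        """Hook registration: connect a hook to a hookable method."""
        self.hook_lists[hookable_method].append(hook)

    def process_last_message(self, user_text: str) -> str:
        """Hook execution: each registered hook may transform the text."""
        for hook in self.hook_lists["process_last_message"]:
            user_text = hook(user_text)
        return user_text
```

A capability's `add_to_agent(agent)` would then be the "capability addition" step that calls `agent.register_hook(...)` with the capability's own hook methods.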
I like this PR and the idea behind it. IMO it's another good example of composition over inheritance, and it enables users to extend an agent's abilities without the need to create a new type of agent.
Below are some questions:
- How to short-circuit middleware -- is it simply not calling the `next()` function in the middleware?
- Is there a way to revert `h` after calling `add_middleware(h, mw, ...)`?
Also, one piece of feedback on the `add_middleware(h, ...)` function: my two cents is that it would be better for `add_middleware` to return a new `h` with the middleware registered rather than modify `h` directly, so that it won't change the behavior of previous callers of `h`.
@LittleLittleCloud thanks! Could you provide the feedback inline in the code?
> How to short-circuit middleware -- is it simply not calling the `next()` function in the middleware?

Right. Once you return, it hands control back to the previous middleware.

> Is there a way to revert `h` after calling `add_middleware(h, mw, ...)`?

The interface is in draft, but I believe you can easily rebuild the middleware chain from scratch if you choose to remove one of them.
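For illustration, a middleware short-circuits simply by returning a value without invoking `next`. Here is a hedged sketch (the class and its blocking rule are made up, not part of the PR's API):

```python
from typing import Callable, Dict, Iterable


class HumanApprovalMiddleware:
    """Short-circuits the rest of the pipeline when a message is blocked,
    by returning early instead of calling `next`. Illustrative only."""

    def __init__(self, blocked_words: Iterable[str]) -> None:
        self.blocked_words = set(blocked_words)

    def call(self, message: Dict, next: Callable[[Dict], Dict]) -> Dict:
        content = message.get("content", "")
        if any(w in content for w in self.blocked_words):
            # Short-circuit: hand control straight back to the caller;
            # no downstream middleware or reply function runs.
            return {"content": "[blocked by human middleware]"}
        return next(message)
```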
> I like this PR and the idea behind it. IMO it's another good example of composition over inheritance, and it enables users to extend an agent's abilities without the need to create a new type of agent.
>
> Below are some questions:
>
> - How to short-circuit middleware -- is it simply not calling the `next()` function in the middleware?

Yes.

> - Is there a way to revert `h` after calling `add_middleware(h, mw, ...)`?

Yes, there is a set_middleware function, but I'll also include a few more like pop, replace, etc.

> Also, one piece of feedback on the `add_middleware(h, ...)` function: my two cents is that it would be better for `add_middleware` to return a new `h` with the middleware registered rather than modify `h` directly, so that it won't change the behavior of previous callers of `h`.
You are absolutely right, I will change the add_middleware to be specific to an instance, not the class:
```python
add_middleware(
    agent.process_last_message_user_text,
    ProcessLastMessageMiddleware(agent=agent, teachability=self),
)
```
Update 01/18/2024
The middleware registration methods add_middleware and set_middleware have been refactored to attach to bound methods, as suggested by @LittleLittleCloud and @ekzhu.
```python
class A:
    def __init__(self, name: str) -> None:
        self.name = name

    @register_for_middleware
    def process_message(self, msg: str, skip_middleware: Optional[bool] = None) -> str:
        return f"{self.name}.process_message({msg=})"


class MyMiddleware:
    def __init__(self, name: str) -> None:
        self.name = name

    def call(self, *args: Any, next: Callable[..., Any], **kwargs: Any) -> str:
        retval = next(*args, **kwargs)
        return f"{self.name}.{format_function(self.call, retval)}"

    def trigger(self, *args: Any, **kwargs: Any) -> bool:
        return not ("skip_middleware" in kwargs and kwargs["skip_middleware"])


a = A("a")

mw = MyMiddleware("mw")
# middleware attached to the bound method a.process_message
add_middleware(a.process_message, mw)
assert a.process_message("hello") == "mw.call(a.process_message(msg='hello'))"

mw2 = MyMiddleware("mw2")
add_middleware(a.process_message, mw2)
assert a.process_message("hello") == "mw.call(mw2.call(a.process_message(msg='hello')))"

b = A("b")
with pytest.raises(ValueError):
    # mw is already attached to a.process_message
    add_middleware(b.process_message, mw)

mwb = MyMiddleware("mwb")
add_middleware(b.process_message, mwb)

# only the mwb middleware is called when calling b.process_message
assert b.process_message("hello") == "mwb.call(b.process_message(msg='hello'))"
# only mw and mw2 are called on a.process_message
assert a.process_message("hello") == "mw.call(mw2.call(a.process_message(msg='hello')))"
```
@ekzhu @davorrunje @LittleLittleCloud Let's also keep in mind that, because AutoGen is easy for beginners to use, it would be great if the middlewares were easy for beginners to understand and use as well. Maybe some tutorials or examples for specific applications?
A framework should be difficult to write and easy to use :) This middleware-pattern approach has proved very successful in Starlette and FastAPI. Of course, we need to write documentation and provide many examples. Again, the FastAPI docs are a very good example of how to do it. Users don't really understand all the magic we do behind the scenes; it just works as expected.
Update 01/19/2024
The code was internally refactored so it accurately uses the signature of the decorated function in the call() methods of a Middleware class. Another change is adding the a_call method to Middleware classes and removing the trigger method. Having both call() and a_call() allows for the most efficient implementation. Decorators for automatically generating call from a_call and vice versa will be added shortly, so we'll still be able to mix sync/async styles if we are willing to pay the price in reduced performance. Examples of Middleware classes were added to the ConversableAgent. Here is a simple one performing logging:
```python
# Notice that the `call()` signature must match the function decorated with
# `register_for_middleware`: arguments are passed to `call()` the same way
# as to `generate_reply()`, apart from `next` being passed as a keyword
# argument. Default values must also be the same.
class _PrintReplyMiddleware:
    def __init__(self, agent: Agent):
        self._agent = agent

    def call(
        self,
        messages: Optional[List[Dict]] = None,
        sender: Optional[Agent] = None,
        # next will be passed as a keyword argument
        next: Optional[Callable[..., Any]] = None,
    ) -> Tuple[bool, Optional[str]]:
        print(f"generate_reply() called: {sender} sending {messages[-1] if messages else messages}")
        retval = next(messages, sender)
        return retval

    async def a_call(
        self,
        messages: Optional[List[Dict]] = None,
        sender: Optional[Agent] = None,
        next: Optional[Callable[..., Any]] = None,
    ) -> Tuple[bool, Optional[str]]:
        print(f"a_generate_reply() called: {sender} sending {messages[-1] if messages else messages}")
        retval = await next(messages, sender)
        return retval


class ConversableAgent(Agent):
    def __init__(self, *args, **kwargs):
        ...
        # attaching middleware to a registered method
        add_middleware(self.generate_reply, _PrintReplyMiddleware(self))
        add_middleware(self.a_generate_reply, _PrintReplyMiddleware(self))

    @register_for_middleware
    def generate_reply(
        self,
        messages: Optional[List[Dict]] = None,
        sender: Optional[Agent] = None,
        exclude: Optional[List[Callable]] = None,
    ) -> Union[str, Dict, None]:
        ...

    @register_for_middleware
    async def a_generate_reply(
        self,
        messages: Optional[List[Dict]] = None,
        sender: Optional[Agent] = None,
        exclude: Optional[List[Callable]] = None,
    ) -> Union[str, Dict, None]:
        ...
```
All the tests are passing and we are ready to start refactoring the ConversableAgent class.
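The update above mentions decorators for generating call from a_call and vice versa. One hedged sketch of the sync-from-async direction (the name `sync_from_async` is hypothetical, not the PR's API):

```python
import asyncio
from typing import Any, Callable


def sync_from_async(a_call: Callable[..., Any]) -> Callable[..., Any]:
    """Derive a synchronous `call` from an async `a_call` by running it on
    a fresh event loop. Sketch only: paying the event-loop setup cost per
    call is the "reduced performance" trade-off mentioned above, and this
    naive version cannot be used from inside an already-running loop."""
    def call(*args: Any, **kwargs: Any) -> Any:
        return asyncio.run(a_call(*args, **kwargs))

    return call
```

The reverse direction (async from sync) could simply wrap the sync function in a coroutine, optionally offloading it to a thread to avoid blocking the loop.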
Update 01/20/2024
Created the following middleware:
- ToolUseMiddleware
- LLMMiddleware
- CodeExecutionMiddleware
- TerminationAndHumanReplyMiddleware
- MessageStoreMiddleware
- TeachabilityMiddleware
See autogen/middleware and contrib/capability/teachability.py
Refactored ConversableAgent by composing it using the middleware above. All public methods are backward-compatible.
Fixed some tests. The failing tests should be easy to fix.
Next steps:
- Use wrappers to unify the sync and async code paths.
- Utilities for building the middleware chain and validating `call(...)` method signatures.
- Fix all tests.
- Update code-level documentation; remove the recommendation for subclassing.
Update 01/26/2024
Async/sync mixing now works in all cases:
- Function and tool calling works in all combinations of `async`/`sync` calls.
- Code execution now works in `a_initialize_chat`.
Quality improvements
All tests are passing now. Code coverage was significantly improved, with the goal of having over 90% of the code covered by tests. Type annotations are fixed, and mypy reports no errors in the autogen/agentchat/middleware and test/agentchat/middleware folders.
```
autogen/agentchat/middleware/base.py             6   0   0   0  100%
autogen/agentchat/middleware/code_execution.py 108   0  36   0  100%
autogen/agentchat/middleware/llm.py            143   4  62   4   95%
autogen/agentchat/middleware/message_store.py  113   5  60   9   92%
autogen/agentchat/middleware/termination.py    127  22  66  14   79%
autogen/agentchat/middleware/tool_use.py       143   0  54   0  100%
```
Next steps:
- Utilities for building the middleware chain and validating `call(...)` method signatures.
- Update code-level documentation; remove the recommendation for subclassing.
- Write a tutorial on extending `ConversableAgent` using middleware instead of subclassing.
I'm going to be honest, even as a mid-level developer, I can't understand a word of the example code. Aren't we making AutoGen easy for everyone to use, regardless of their skill level? The basic Autogen code is fairly simple, set up a model, instantiate an agent, initiate that agent chat. Where do these middlewares fit into that process?
Middleware is meant for the framework developer, not for the application developer. It doesn't change the interface of AutoGen; rather, it changes the backend and how the "under-the-hood" stuff is written.
You can read the PR description for the motivation.
Currently in AutoGen, each incoming message is handled by a pipeline of registered reply functions. Each reply function is gated by a trigger function. If a reply function is triggered and signals that it produced a final reply, it short-circuits the pipeline and returns the generated reply to the sender.
This design pattern is described in the AutoGen paper (https://arxiv.org/pdf/2308.08155.pdf, Section 2). This PR is a refactor: we convert reply functions into middleware classes so they can better handle state like code executors, message history, etc.
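The trigger-gated reply pipeline described above can be sketched as follows. This is a simplified model for illustration, not ConversableAgent's actual code (the `(is_final, reply)` convention matches the description; the function names are made up):

```python
from typing import Callable, Dict, List, Optional, Tuple

# Each reply function returns (is_final, reply); a trigger decides whether
# it runs for a given sender.
ReplyFunc = Callable[[Dict], Tuple[bool, Optional[str]]]
Trigger = Callable[[str], bool]


def generate_reply(
    message: Dict,
    sender: str,
    registry: List[Tuple[Trigger, ReplyFunc]],
) -> Optional[str]:
    """Run registered reply functions in order; the first one that is
    triggered and signals a final reply short-circuits the pipeline."""
    for trigger, reply_func in registry:
        if not trigger(sender):
            continue  # this reply function is not triggered by this sender
        is_final, reply = reply_func(message)
        if is_final:
            # Short-circuit: skip the remaining reply functions.
            return reply
    return None
```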