Fix #2643 - groupchat model registration
Why are these changes needed?
When using a custom model client with `GroupChatManager`, the underlying `GroupChat` internally defines two `ConversableAgent`s that receive the `llm_config` passed to `GroupChatManager` but don't handle model registration (and there is no way to do so externally without subclassing `GroupChat` and overriding the two methods that create those agents). This contribution allows a user to pass a `ModelClient` (or a list of `ModelClient`s) through a `model_client_cls` attribute, which is used solely for automatic registration of custom models.
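For illustration, the intended usage reads roughly like the sketch below. This is a sketch only, not code from the PR diff: `CustomModelClient` stands for a user-defined class implementing autogen's `ModelClient` protocol (definition omitted), and `model_client_cls` is the new attribute described above, whose name is revised later in this thread.

```python
from autogen import ConversableAgent, GroupChat, GroupChatManager

# The config list references the custom client by name, as usual for custom model clients.
llm_config = {"config_list": [{"model": "my-model", "model_client_cls": "CustomModelClient"}]}

agents = [ConversableAgent(name=f"agent_{i}", llm_config=llm_config) for i in range(2)]
for agent in agents:
    # Explicit registration is possible for agents the user creates themselves...
    agent.register_model_client(model_client_cls=CustomModelClient)

# ...but the agents GroupChat creates internally for "auto" speaker selection are out of reach.
# The new attribute lets GroupChat register the custom client on them automatically.
group_chat = GroupChat(agents=agents, messages=[], model_client_cls=CustomModelClient)
manager = GroupChatManager(groupchat=group_chat, llm_config=llm_config)
```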
Related issue number
Closes #2643
Checks
- [x] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
- [x] I've added tests (if relevant) corresponding to the changes introduced in this PR.
- [x] I've made sure all auto checks have passed.
@microsoft-github-policy-service agree
Codecov Report
Attention: Patch coverage is 18.91892% with 30 lines in your changes missing coverage. Please review.
Project coverage is 20.41%. Comparing base (6279247) to head (40131dc). Report is 107 commits behind head on main.
Files | Patch % | Lines |
---|---|---|
autogen/agentchat/groupchat.py | 18.91% | 30 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## main #2696 +/- ##
===========================================
- Coverage 32.90% 20.41% -12.49%
===========================================
Files 94 95 +1
Lines 10235 10701 +466
Branches 2193 2294 +101
===========================================
- Hits 3368 2185 -1183
- Misses 6580 8361 +1781
+ Partials 287 155 -132
Flag | Coverage Δ | |
---|---|---|
unittests | 20.40% <18.91%> (-12.50%) | :arrow_down: |
Flags with carried forward coverage won't be shown.
@ekzhu @sonichi @marklysze I completed all the changes you requested, but after merging main several builds are failing because of a missing package (namely "packaging").
@ekzhu I see you previously submitted two commits because of this bug on requests. I'm fairly sure that, as with that one, this problem with the "packaging" library is not related to my code. Could you maybe take a look at your earliest convenience?
Do you think the PR could be approved after this is solved?
@Matteo-Frattaroli, thanks for the hard work on this! Your code has come at just the right time as I'm testing a new custom client class for Mistral and I wanted to test Group Chat :).
My first run worked with your code as is, which is great... I'll test more...
@marklysze thank you for the positive feedback!
As I'm sure you noticed, though, there is some problem with the CI, which is why I rolled back the latest merge from master to try to isolate the problem locally. Once this is fixed I'll resolve the conflicts (which really just concern a method I extracted to create the agents) and maybe ask for a new review.
Meanwhile, if you have any directions on the current issue I can try to help (even though I know you're probably already looking into it and are certainly more capable of doing so than I am).
Ah, okay... do you have any output of the issues? I just see the current OpenAI ones, which I think are normal as they require approval before they can be run (I believe)...
Hi Mark, everything seems okay here because I reverted the latest merge of master, which includes the problematic commit. Check out this pipeline: as you can see, the issue is related to a missing dependency (packaging).
Ah, I see... perhaps that is a GitHub issue? Maybe one of the core team members can comment as I'm not familiar with that side of things, unfortunately...
⚠️ GitGuardian has uncovered 3 secrets following the scan of your pull request.
Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.
Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard. Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.
🔎 Detected hardcoded secrets in your pull request
GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
---|---|---|---|---|---|
10493810 | Triggered | Generic Password | 97fa339ac749b3c7d51f1b0fae156d41c02b214e | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
10493810 | Triggered | Generic Password | 97fa339ac749b3c7d51f1b0fae156d41c02b214e | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
10493810 | Triggered | Generic Password | 97fa339ac749b3c7d51f1b0fae156d41c02b214e | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secrets safely. Learn the best practices here.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflows and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider
- following these best practices for managing and storing secrets, including API keys and other credentials
- installing secret detection on pre-commit to catch secrets before they leave your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
There is one approach to solve the problem in a more fundamental way: move `register_model_client` out of the agents. Instead, make it a function to manipulate configs. Then, pass the materialized configs to agents so that the agents never need to worry about `register_model_client`.
We can implement it in a different PR, if you wish. If we merge this PR now, some code in this PR will be removed though.
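As a purely hypothetical sketch of that direction (the function name and config keys below are invented for illustration and are not existing AutoGen API), the idea would be to resolve the custom client before any agent is constructed:

```python
# Hypothetical illustration only: materialize custom clients into the config up front,
# so agents receive ready-to-use clients and never call register_model_client themselves.
def materialize_model_clients(config_list, model_client_cls, **client_kwargs):
    materialized = []
    for entry in config_list:
        entry = dict(entry)  # avoid mutating the caller's config
        if entry.get("model_client_cls") == model_client_cls.__name__:
            # Replace the class-name marker with an instantiated client.
            entry["model_client"] = model_client_cls(entry, **client_kwargs)
        materialized.append(entry)
    return materialized
```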
Hi @sonichi, that's clearly a solution, but it would affect several parts of the codebase and lead to a breaking change.
Since what you're proposing is a fairly large intervention, it would be nice to merge this PR in the meantime, so that we at least have the possibility of using custom models with group chats while the solution you proposed is being implemented.
@ekzhu if that's the case, I'd ask you to close the change request related to the unit test, as it was implemented.
Hello again @sonichi - I was reflecting on your idea during my lunch break, and looking at the code I'm not 100% sure it would completely resolve this problem if we want to preserve the current functionality.
If I understood your idea correctly, you're proposing either to use only a config that would handle registration automatically, or at least to convert `register_model_client` from a method into a standalone function.
As of now, though, `register_model_client` requires some instance attributes (namely the list of `_clients` available to the `OpenAIWrapper` used by an agent) to check that the model client class we're trying to register is indeed present in the config, and to create an instance of that custom model client in the `_clients` attribute itself.
We would then still need some way to access the agents that are created internally by the GroupChat in order to answer the question "for which agent am I registering this model client?". Otherwise, the only option would be to limit the freedom of choice and register a given model client for all agents of a given type. Alternatively, it would be possible to reimplement the constructors of all existing agents to handle registration given only the config (I imagine in some sort of automatic fashion, as I've done here).
Am I missing something? Please let me know if you had a different idea in mind.
When can we expect this PR to be merged? Group chat with "auto" speaker selection does not work without this. :(
@Matteo-Frattaroli and @sourabhXIII, I'll come back to this one and have a look through and test. We have a roadmap that may benefit from this change as well.
Hi all - subject to some further testing that I'll do today, I'd like to proceed with this change as designed. I think a more fundamental change to the registration of client classes on agents is good, but in the interest of timeliness and impact, if we can move forward with this I think it will help the community in the short term.
@sourabhXIII and @Matteo-Frattaroli, I'm testing at the moment and will suggest a few tweaks to it. I wanted to ask:
1. Do we need the `llm_config` parameter if the `model_client_cls` instance contains a config already?
2. Do you foresee using both the `llm_config` and `model_client_cls` at once, or independently?
I want to test the permutations and existing code to minimise the chance of breaking something.
A few changes I will suggest (see the sketch after this list):
1. Cater for those already using group chats by supporting the current selector option (leave `selector` in there as a parameter and use it unless `llm_config`/`model_client_cls` is passed in).
2. Prepend `select_speaker_auto_` to the parameter/attribute names so that it's clear they are for that component and the names are consistent with other attributes: `select_speaker_auto_llm_config` and `select_speaker_auto_model_client_cls`.
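For illustration, a group chat using those renamed parameters might look roughly like this (a sketch only, following the names proposed in this thread; `CustomModelClient`, `agent_a`, and `agent_b` are assumed to be defined elsewhere, and the final signature may differ from what gets merged):

```python
inner_llm_config = {"config_list": [{"model": "my-model", "model_client_cls": "CustomModelClient"}]}

group_chat = GroupChat(
    agents=[agent_a, agent_b],
    messages=[],
    speaker_selection_method="auto",
    # Config and custom client used only by the internal "select speaker" agents:
    select_speaker_auto_llm_config=inner_llm_config,
    select_speaker_auto_model_client_cls=CustomModelClient,
)
```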
Hi Mark, about your questions: I'll check again and get back to you. The changes you suggest certainly seem reasonable, and I'll try to get to them over the weekend.
Thanks @Matteo-Frattaroli!
Hi @marklysze Thank you for investigating this ask.
- Do we need the llm_config parameter if the model_client_cls instance contains a config already?
- Do you foresee using both the llm_config and model_client_cls at once or independently?
I do not see any strong reason to use both `llm_config` and `model_client_cls`, since `register_model_client()` accepts kwargs anyway and passes them on to the `create()` of the `model_client_cls` instance.
One soft reason to allow both that I can think of, though: organisations often set up an LLM-gateway layer to communicate with any LLM via its features. That gateway accepts params similar to OpenAI's, so keeping those params in `llm_config`, just like OpenAI params, makes sense, while `model_client_cls` just makes the call to their own LLM gateway and wraps the output in `ModelClientResponseProtocol`.
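For reference, a gateway-backed client of that shape might look roughly like the sketch below. This is only an illustration of the pattern being described, not code from this PR; the gateway URL, request payload, and config keys are assumptions, while the `create`/`message_retrieval`/`cost`/`get_usage` methods follow AutoGen's documented custom model client protocol.

```python
from types import SimpleNamespace
import requests  # any HTTP client would do; used here for brevity

class GatewayModelClient:
    """Sketch of a custom client for an internal LLM gateway (all endpoints hypothetical)."""

    def __init__(self, config, **kwargs):
        # `config` is the entry from llm_config["config_list"] that names this class.
        self.model = config.get("model", "internal-llm")
        self.base_url = config.get("base_url", "https://llm-gateway.example.internal/v1")
        self.api_key = config.get("api_key")

    def create(self, params):
        # Forward OpenAI-style params to the gateway and wrap the reply so that
        # response.choices[0].message.content is readable by AutoGen.
        reply = requests.post(
            f"{self.base_url}/chat/completions",
            headers={"Authorization": f"Bearer {self.api_key}"},
            json={"model": self.model, "messages": params["messages"]},
            timeout=60,
        ).json()
        message = SimpleNamespace(content=reply["choices"][0]["message"]["content"])
        return SimpleNamespace(choices=[SimpleNamespace(message=message)], model=self.model, cost=0.0)

    def message_retrieval(self, response):
        return [choice.message.content for choice in response.choices]

    def cost(self, response):
        return response.cost

    @staticmethod
    def get_usage(response):
        # Minimal usage accounting; a real client would report token counts from the gateway.
        return {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0,
                "cost": response.cost, "model": response.model}
```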
Hi @marklysze, I'm sorry I couldn't get back to you before. I was just checking the code again and, about the points you raised, I'd say that passing in both an `llm_config` and a `model_client_cls` is necessary for the following reasons:
- the `ConversableAgent` instances are created internally by `GroupChat`, and both parameters are needed to call the `register_model_client` method
- while it's true that the custom class implemented over the `ModelClient` protocol uses the config internally, we can't pass an instance of this class to the GroupChat; we should rather pass the class itself. Also, `register_model_client` requires some attributes (namely the list of `_clients` available to the `OpenAIWrapper` used by an agent) to check that the model client class we're trying to register is indeed present in the config, and to create an instance of that custom model client in the `_clients` attribute itself.
Summarizing, using only `model_client_cls` would require some heavy modification of the existing `ConversableAgent` class that, as previously discussed with @sonichi, would change the registration mechanism for all agents of this type even when not using a group chat.
For example, running the following code:

```python
agent2 = ConversableAgent(
    name="Agent 2",
    # llm_config=llm_config
)
agent2.register_model_client(model_client_cls=CustomModelClient)
```

would result in:

```
AttributeError: 'NoneType' object has no attribute 'register_model_client'
```

Or similarly, running:

```python
agent2 = ConversableAgent(
    name="Agent 2",
    llm_config=llm_config
)
# agent2.register_model_client(model_client_cls=CustomModelClient)
```

would result in:

```
RuntimeError: Model client(s) ['CustomModelClient'] are not activated. Please register the custom model clients using `register_model_client` or filter them out form the config list.
```
Anyway, I kindly ask you to tell me if you think I'm missing something or if you see some simpler way to do this. I also ran the tests (both the existing ones and the one I added to verify the new implementation works correctly) and I don't think there should be any breaking change (mainly because both parameters are optional).
Finally, about this:
A few changes I will suggest:
1. Cater for those already using group chats by supporting the current selector option (leave `selector` in there as parameters and use it unless `llm_config`/`model_client_cls` is passed in). 2. Prepending `select_speaker_auto_` to the parameter/attribute names so that it's clear they are for that component and names are consistent with other attributes. `select_speaker_auto_llm_config` and `select_speaker_auto_model_client_cls`.
I've implemented and committed the changes you requested.
Thank you!
Thanks for your feedback and code changes, @Matteo-Frattaroli, sorry it took me a few days as well... I'll run through the changes and provide some feedback.
@Matteo-Frattaroli, nice work with the changes - I've tested with a custom model client class for the select speaker auto-mode inner conversation and it worked a treat :).
For the note and discussion around having both `llm_config` and `model_client_cls`: in my testing it did make sense to have both, with `llm_config` in use to remain consistent with the way AutoGen handles configs for agents in general. @sourabhXIII made a good point about the kwargs, and perhaps a future change could accommodate having the kwargs provide the parameters for the custom model client class, but for now I think having both is consistent and okay.
I've made a couple of comments above, minor changes (though the documentation note will be important to help devs).
Also, the changes to the test file passed; @ekzhu, not sure if you would prefer mocks being used. I've added a small suggested change to align better with the code in groupchat.
Hi, thank you so much for your effort. I currently have an issue when registering a custom model class and am waiting for this PR to be merged. I've done some tests with this PR and ran into some issues; can you please check them out:
- Got a runtime error when using async chat, since the `selector` param has been removed from the `a_select_speaker` function but the `a_run_chat` function still passes the param.
- Unable to pass kwargs when registering a custom model, because in the `_register_client_from_config` function you call the register function without passing any params: `agent.register_model_client(select_speaker_auto_model_client_cls)`. I think you could add an additional param when initializing the group chat (e.g. `select_speaker_auto_model_client_cls_kwarg`) and then pass it to the register model function as kwargs.
I'm super new to this project so I'm not sure if I'm doing this right, and thank you again for your work here.
Hey @thaind-taureau, thanks for your comment...
For #1, I've updated the async select speaker, `a_select_speaker` - are you able to test that now?
For #2, can you show some code that you'd like to see so we can see if there's a way around it?
Hi @marklysze
- I'm going to test it now.
- This is the workaround I'm currently using. I just add `model_client_cls_kwargs` as a dictionary and then override the last line of the `_register_client_from_config` function:
```python
class CustomGroupChat(GroupChat):
    def __init__(
        self,
        model_client_cls_kwargs: Optional[Dict] = None,  # Add this parameter
        *args, **kwargs
    ):
        super().__init__(*args, **kwargs)
        self.model_client_cls_kwargs = model_client_cls_kwargs

    # override _register_client_from_config
    def _register_client_from_config(self, agent: Agent, config: Dict):
        # change the last line of the function, keep everything else
        ...
        agent.register_model_client(select_speaker_auto_model_client_cls, **self.model_client_cls_kwargs)  # I change this line
```
@marklysze Update for #1: tested the code and it works. Thank you.
Thanks for testing number 1 @thaind-taureau, I'm glad that it worked.
For number 2, are you able to provide an example of the kwargs you're passing in?
Add a unit test with mocks to check if the select speaker agent has gotten the propagated custom client.
Hey @ekzhu - the test has been added, I've corrected the formatting
This PR is against AutoGen 0.2. AutoGen 0.2 has been moved to the 0.2 branch. Please rebase your PR on the 0.2 branch or update it to work with the new AutoGen 0.4 that is now in main.
Hi @marklysze. Sorry for the late reply, but I'm not sure what you want to see; please tell me if this doesn't answer your question. Currently, when using this workaround, I'm just passing a dictionary that contains some extra objects, because my custom model class requires some extra work, something like this:
```python
model_client_cls_kwargs = {
    "user_id": user.id,
    "model": llm_model,
    "db_session": db_session
}

group_chat = CustomGroupChat(
    model_client_cls_kwargs=model_client_cls_kwargs,
    ...
)
```