Ignore Some Messages When Transforming
Why are these changes needed?
This PR implements some utilities that can be useful when applying TransformMessages to GroupChatManager. This is also part 1 of 2.
The changes in this PR allow specific messages to be included or excluded during the transformation process based on user-defined criteria.
-
filter_configswas improved so the user can have finer control when filtering configs. If the user decides toexclude=False(default), then whichever config matches thefilter_dictwill be returned (the behavior that is currently in main). Ifexclude=Truethen the opposite happens; whichever config matches thefilter_dictwill be excluded from the final result. -
Some transforms like
TextMessageCompressorandMessageTokenLimitercan use thefilter_dictconcept present in autogen to ignore some messages from being transformed.
Example:
- Let us say we want to ignore all system messages from being compressed, we can pass in a
filter_dict={"role": ["system"]}, exclude=True. - How about when we want to compress messages that only come from a specific agent,
filter_dict={"name": ["compress_me_agent"]}, exclude=False.
Future Work:
-
PrefixNametransform: Prepends every message with a text template, e.g. "The agent {agent_name} said this:". It is useful when API providers don't use thenamekey in the message. -
FilterMessage: Drops all messages based on thefilter_dictyou pass in. E.g. we want to drop all system messages from the context.
Related issue number
Checks
- [ ] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
- [x] I've added tests (if relevant) corresponding to the changes introduced in this PR.
- [x] I've made sure all auto checks have passed.
Amazing work @WaelKarkoub, I'll test it out asap :)
@WaelKarkoub, works really well. I used it in some test code as well as using it in the select speaker process using both role and name based filtering - and it worked as described. This takes the select speaker a good step forward in maintaining context (by keeping system and other key messages as is) while trying to keep context counts down.
Thanks!
@marklysze I appreciate the help with testing the PR out :pray:!
For PrefixName, that's actually a good insight. For the first iteration of this transform, we can have a template that only includes the sender (e.g. "{sender} said this:"), and then in a future PR we could expand it (e.g. "{sender} said to {receiver}"). Given this new behavior, we should probably rename it to something more generic like PrefixTextMessage. At this moment, I'm not worried about optimizing these transforms, but more of a tool to gauge what users want out of autogen.
@marklysze I appreciate the help with testing the PR out :pray:!
For
PrefixName, that's actually a good insight. For the first iteration of this transform, we can have a template that only includes the sender (e.g. "{sender} said this:"), and then in a future PR we could expand it (e.g. "{sender} said to {receiver}"). Given this new behavior, we should probably rename it to something more generic likePrefixTextMessage. At this moment, I'm not worried about optimizing these transforms, but more of a tool to gauge what users want out of autogen.
Sounds good, I'd definitely use it for testing out prompts!
@ekzhu good call! I added tests to cover those code paths. I also added a text compressor mock to allow us to test TextMessageCompressor even when llmlingua is not installed
Codecov Report
Attention: Patch coverage is 6.45161% with 29 lines in your changes are missing coverage. Please review.
Project coverage is 16.04%. Comparing base (
11d9336) to head (0312d01). Report is 5 commits behind head on main.
| Files | Patch % | Lines |
|---|---|---|
| ...togen/agentchat/contrib/capabilities/transforms.py | 0.00% | 20 Missing :warning: |
| autogen/oai/openai_utils.py | 18.18% | 9 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## main #2661 +/- ##
===========================================
- Coverage 33.60% 16.04% -17.56%
===========================================
Files 87 87
Lines 9336 9359 +23
Branches 1987 1992 +5
===========================================
- Hits 3137 1502 -1635
- Misses 5933 7803 +1870
+ Partials 266 54 -212
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 16.04% <6.45%> (-17.56%) |
:arrow_down: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.