autogen Ignore Some Messages When Transforming

Why are these changes needed?

This PR implements some utilities that can be useful when applying TransformMessages to GroupChatManager. This is also part 1 of 2.

The changes in this PR allow specific messages to be included or excluded during the transformation process based on user-defined criteria.

filter_configs was improved so the user can have finer control when filtering configs. If the user decides to exclude=False (default), then whichever config matches the filter_dict will be returned (the behavior that is currently in main). If exclude=True then the opposite happens; whichever config matches the filter_dict will be excluded from the final result.
Some transforms like TextMessageCompressor and MessageTokenLimiter can use the filter_dict concept present in autogen to ignore some messages from being transformed.

Example:

Let us say we want to ignore all system messages from being compressed, we can pass in a filter_dict={"role": ["system"]}, exclude=True.
How about when we want to compress messages that only come from a specific agent, filter_dict={"name": ["compress_me_agent"]}, exclude=False.

Future Work:

PrefixName transform: Prepends every message with a text template, e.g. "The agent {agent_name} said this:". It is useful when API providers don't use the name key in the message.
FilterMessage: Drops all messages based on the filter_dict you pass in. E.g. we want to drop all system messages from the context.

Related issue number

Checks

[ ] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
[x] I've added tests (if relevant) corresponding to the changes introduced in this PR.
[x] I've made sure all auto checks have passed.

May 11 '24 16:05 WaelKarkoub

Amazing work @WaelKarkoub, I'll test it out asap :)

May 11 '24 19:05 marklysze

@WaelKarkoub, works really well. I used it in some test code as well as using it in the select speaker process using both role and name based filtering - and it worked as described. This takes the select speaker a good step forward in maintaining context (by keeping system and other key messages as is) while trying to keep context counts down.

Thanks!

May 11 '24 23:05 marklysze

@marklysze I appreciate the help with testing the PR out :pray:!

For PrefixName, that's actually a good insight. For the first iteration of this transform, we can have a template that only includes the sender (e.g. "{sender} said this:"), and then in a future PR we could expand it (e.g. "{sender} said to {receiver}"). Given this new behavior, we should probably rename it to something more generic like PrefixTextMessage. At this moment, I'm not worried about optimizing these transforms, but more of a tool to gauge what users want out of autogen.

May 12 '24 01:05 WaelKarkoub

@marklysze I appreciate the help with testing the PR out :pray:!

For PrefixName, that's actually a good insight. For the first iteration of this transform, we can have a template that only includes the sender (e.g. "{sender} said this:"), and then in a future PR we could expand it (e.g. "{sender} said to {receiver}"). Given this new behavior, we should probably rename it to something more generic like PrefixTextMessage. At this moment, I'm not worried about optimizing these transforms, but more of a tool to gauge what users want out of autogen.

Sounds good, I'd definitely use it for testing out prompts!

May 12 '24 03:05 marklysze

@ekzhu good call! I added tests to cover those code paths. I also added a text compressor mock to allow us to test TextMessageCompressor even when llmlingua is not installed

May 13 '24 15:05 WaelKarkoub

Codecov Report

Attention: Patch coverage is 6.45161% with 29 lines in your changes are missing coverage. Please review.

Project coverage is 16.04%. Comparing base (11d9336) to head (0312d01). Report is 5 commits behind head on main.

Files	Patch %	Lines
...togen/agentchat/contrib/capabilities/transforms.py	0.00%	20 Missing :warning:
autogen/oai/openai_utils.py	18.18%	9 Missing :warning:

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #2661       +/-   ##
===========================================
- Coverage   33.60%   16.04%   -17.56%     
===========================================
  Files          87       87               
  Lines        9336     9359       +23     
  Branches     1987     1992        +5     
===========================================
- Hits         3137     1502     -1635     
- Misses       5933     7803     +1870     
+ Partials      266       54      -212

Flag	Coverage Δ
unittests	`16.04% <6.45%> (-17.56%)`	:arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

May 21 '24 17:05 codecov-commenter