
Ignore Some Messages When Transforming

WaelKarkoub opened this pull request 1 year ago • 5 comments

Why are these changes needed?

This PR implements some utilities that can be useful when applying TransformMessages to GroupChatManager. This is also part 1 of 2.

The changes in this PR allow specific messages to be included or excluded during the transformation process based on user-defined criteria.

  • filter_configs was improved to give the user finer control when filtering configs. With exclude=False (the default), the configs that match filter_dict are returned (the behavior currently on main). With exclude=True, the opposite happens: configs that match filter_dict are excluded from the final result.

  • Some transforms like TextMessageCompressor and MessageTokenLimiter can use the filter_dict concept present in autogen to ignore some messages from being transformed.
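The filter semantics described above can be sketched in a few lines. This is an illustrative reimplementation of the described behavior, not the library's actual code; the function names `matches` and `filter_items` are hypothetical.

```python
def matches(item, filter_dict):
    """True if every key in filter_dict maps to a value of item that
    appears in the corresponding list of accepted values."""
    return all(item.get(key) in accepted for key, accepted in filter_dict.items())


def filter_items(items, filter_dict, exclude=False):
    """Keep matching items when exclude=False (default); drop them when exclude=True."""
    return [item for item in items if matches(item, filter_dict) != exclude]
```

With exclude=False this reduces to the existing include-only filtering; flipping the flag inverts the selection without duplicating the matching logic.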

Example:

  • Say we want to keep all system messages from being compressed: pass filter_dict={"role": ["system"]} with exclude=True.
  • To compress only messages coming from a specific agent: pass filter_dict={"name": ["compress_me_agent"]} with exclude=False.
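Applied to messages, the same idea decides which messages a transform should skip. A minimal sketch under the assumptions above (the helper name `is_ignored` is hypothetical):

```python
def is_ignored(message, filter_dict, exclude):
    """Return True when a message should be left untransformed.

    With exclude=True, messages matching filter_dict are skipped;
    with exclude=False, only matching messages are transformed."""
    if filter_dict is None:
        return False  # no filter: transform everything
    matched = all(message.get(key) in accepted for key, accepted in filter_dict.items())
    return matched if exclude else not matched
```

A transform like TextMessageCompressor would then compress only the messages for which `is_ignored` returns False.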

Future Work:

  • PrefixName transform: Prepends a text template to every message, e.g. "The agent {agent_name} said this:". It is useful when API providers don't use the name key in the message.
  • FilterMessage: Drops all messages matching the filter_dict you pass in, e.g. dropping all system messages from the context.
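The proposed PrefixName transform could look roughly like this. This is a speculative sketch of the idea, not the eventual implementation; the function name and template default are illustrative.

```python
def prefix_names(messages, template="The agent {agent_name} said this: "):
    """Prepend a name-bearing template to each message's content (sketch)."""
    out = []
    for msg in messages:
        prefixed = dict(msg)  # don't mutate the caller's message
        prefixed["content"] = template.format(agent_name=msg.get("name", "unknown")) + msg["content"]
        out.append(prefixed)
    return out
```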

Related issue number

Checks

  • [ ] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
  • [x] I've added tests (if relevant) corresponding to the changes introduced in this PR.
  • [x] I've made sure all auto checks have passed.

WaelKarkoub avatar May 11 '24 16:05 WaelKarkoub

Amazing work @WaelKarkoub, I'll test it out asap :)

marklysze avatar May 11 '24 19:05 marklysze

@WaelKarkoub, works really well. I used it in some test code as well as in the select speaker process, with both role- and name-based filtering - and it worked as described. This takes the select speaker a good step forward in maintaining context (by keeping system and other key messages as is) while trying to keep context counts down.

Thanks!

marklysze avatar May 11 '24 23:05 marklysze

@marklysze I appreciate the help with testing the PR out :pray:!

For PrefixName, that's actually a good insight. For the first iteration of this transform, we can have a template that only includes the sender (e.g. "{sender} said this:"), and then in a future PR we could expand it (e.g. "{sender} said to {receiver}"). Given this new behavior, we should probably rename it to something more generic like PrefixTextMessage. At this moment, I'm not worried about optimizing these transforms; they're more a tool to gauge what users want out of autogen.

WaelKarkoub avatar May 12 '24 01:05 WaelKarkoub

Sounds good, I'd definitely use it for testing out prompts!

marklysze avatar May 12 '24 03:05 marklysze

@ekzhu good call! I added tests to cover those code paths. I also added a text compressor mock so we can test TextMessageCompressor even when llmlingua is not installed.
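A mock like the one described might look as follows. This is a hypothetical sketch: the class name, method name, and return-dict keys are assumptions modeled on what a compressor stand-in would need, not the PR's actual test code.

```python
class MockTextCompressor:
    """Stand-in for an llmlingua-backed compressor: deterministically
    truncates text so tests don't need the real dependency."""

    def compress_text(self, text, **kwargs):
        # Keep the first half of the text as the "compressed" result.
        compressed = text[: max(1, len(text) // 2)]
        return {
            "compressed_prompt": compressed,
            "origin_tokens": len(text.split()),
            "compressed_tokens": len(compressed.split()),
        }
```

Because the output is deterministic, tests can assert exact results without network access or the llmlingua package.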

WaelKarkoub avatar May 13 '24 15:05 WaelKarkoub

Codecov Report

Attention: Patch coverage is 6.45161%, with 29 lines in your changes missing coverage. Please review.

Project coverage is 16.04%. Comparing base (11d9336) to head (0312d01). Report is 5 commits behind head on main.

Files Patch % Lines
...togen/agentchat/contrib/capabilities/transforms.py 0.00% 20 Missing :warning:
autogen/oai/openai_utils.py 18.18% 9 Missing :warning:
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #2661       +/-   ##
===========================================
- Coverage   33.60%   16.04%   -17.56%     
===========================================
  Files          87       87               
  Lines        9336     9359       +23     
  Branches     1987     1992        +5     
===========================================
- Hits         3137     1502     -1635     
- Misses       5933     7803     +1870     
+ Partials      266       54      -212     
Flag Coverage Δ
unittests 16.04% <6.45%> (-17.56%) :arrow_down:

Flags with carried forward coverage won't be shown.

:umbrella: View full report in Codecov by Sentry.

codecov-commenter avatar May 21 '24 17:05 codecov-commenter