[Feature Request] Don't throw an exception on hitting the context limit
Required prerequisites
- [X] I have searched the Issue Tracker and Discussions that this hasn't already been reported. (+1 or comment there if it has.)
- [X] Consider asking first in a Discussion.
Motivation
At the moment the Chat Agent throws an exception when the context window size is reached. This is undesirable and makes RolePlaying sessions lack persistence. A user may want the chat to continue even if accuracy degrades (if it degrades at all), especially for small models with 512/2048-token context windows. The current implementation with message_window_size
looks like a half measure and does not guarantee the agent won't throw an exception on hitting the token limit if one or more of the messages are very large. @lightaime
Solution
It could be one of these two approaches:
- Fill the context in reverse, from the most recent message back to the oldest, until the window is full, discarding whatever older messages in the history do not fit (see the first sketch after this list).
- Smart, adaptive compression of old messages: chop off the tail of a message in proportion to its age, or alternatively cut out its middle. For example, fill the first half of the context with full messages, then truncate messages to half their length to fill the next quarter, then to a quarter of their length to fill the next eighth, and so on (see the second sketch after this list).
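A minimal sketch of the first option, assuming a `count_tokens` helper that matches whatever tokenizer the agent already uses (both `fit_context` and `count_tokens` are hypothetical names, not existing camel APIs):

```python
from typing import Callable, List

def fit_context(
    messages: List[str],
    count_tokens: Callable[[str], int],
    context_limit: int,
) -> List[str]:
    """Keep the most recent messages that fit into the context window."""
    kept: List[str] = []
    used = 0
    # Walk the history backwards so the newest messages win.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > context_limit:
            break  # everything older than this message is dropped
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

This is the simplest behaviour and mirrors what message_window_size tries to do, but with a token budget instead of a fixed message count, so a single very large message can no longer blow past the limit.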
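And a sketch of the second, geometric-truncation idea, again with hypothetical helpers: `truncate(message, max_tokens)` is assumed to trim a message to at most `max_tokens` tokens (from the tail, or out of the middle):

```python
from typing import Callable, List

def compress_history(
    messages: List[str],
    count_tokens: Callable[[str], int],
    truncate: Callable[[str, int], str],
    context_limit: int,
) -> List[str]:
    """Truncate messages geometrically harder the older they are."""
    kept: List[str] = []
    used = 0
    tier_budget = context_limit // 2  # token budget of the current tier
    tier_end = tier_budget            # cumulative boundary of the tier
    factor = 1                        # current message-length divisor
    for message in reversed(messages):  # newest message first
        shortened = truncate(message, max(1, count_tokens(message) // factor))
        # Advance to smaller tiers (stronger truncation) until the message fits.
        while used + count_tokens(shortened) > tier_end:
            tier_budget //= 2
            if tier_budget == 0:  # context exhausted; drop the rest
                kept.reverse()
                return kept
            tier_end += tier_budget
            factor *= 2
            shortened = truncate(message, max(1, count_tokens(message) // factor))
        kept.append(shortened)
        used += count_tokens(shortened)
    kept.reverse()  # restore chronological order
    return kept
```

The geometric tiers guarantee termination: the per-tier budget halves until it reaches zero, so even a single oversized message ends up truncated rather than raising an exception.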