
[Enhance] tokenlimit Summarize up to the Last User Message

Open hesamsheikh opened this issue 1 month ago • 4 comments

We want to enhance the tokenlimit context summarizer (#3227) to test the following hypothesis. Currently, tokenlimit summarizes the whole context once it reaches a certain threshold. Even though we ask the agent to include a Pending Task section so it knows where to continue after summarization, this can be unreliable because we rely solely on the LLM's understanding. We would like to test whether continuity improves if we summarize the context UP TO THE LAST USER MESSAGE and then append that user message after the summary message.

So the context would be:

System Message → Context Summary → Last User Message

The agent must continue from here.

This is a tiny modification that may have a significant impact on mitigating the loss of context after summarization.
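
Roughly, the change could look something like the sketch below; `summarize()` and the plain message dicts are placeholders, not the actual tokenlimit implementation:

```python
# Sketch of "summarize up to the last user message".
# `summarize` and the message-dict format are placeholders, not the CAMEL API.

def compress_context(messages, summarize):
    """Summarize everything between the system message and the last user
    message, then rebuild the context as [system, summary, last user message]."""
    system = messages[0]  # assume messages[0] is the system message

    # Index of the most recent user message.
    last_user_idx = max(i for i, m in enumerate(messages) if m["role"] == "user")
    last_user = messages[last_user_idx]

    # Only the messages *before* the last user message go into the summary.
    summary = {"role": "assistant", "content": summarize(messages[1:last_user_idx])}

    # The agent continues by responding directly to the untouched user message.
    return [system, summary, last_user]
```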

hesamsheikh avatar Nov 04 '25 09:11 hesamsheikh

Hi @hesamsheikh, thanks for this issue. For now we already add all of the user's inputs to the Context Summary, so I think the performance loss due to the lack of user input should be minimal?

fengju0213 avatar Nov 05 '25 05:11 fengju0213

So if my understanding is correct @fengju0213, the summarization happens after the user message has been added and before the agent responds to it. It seems a bit counterintuitive to fold the last, still-unprocessed message in with the other ones and explain it in the Pending Task section: the last user message does not actually need to be summarized, since there is no response to it yet, and the message itself is not long compared to the other parts of the context. So this actually creates a challenge rather than solving one.

Another thing that isn't intuitive is that there is no user message after the summarization; the current flow is: System message → Assistant (Summary) → Assistant (Continue the task)

while a more natural flow would be: System message → Assistant (Summary, up to the last user message) → User (last message)

Not only is this more natural for developers, we also wouldn't need to rely on the LLM's reasoning to continue the task, and maybe we could even remove the 'Pending Task' section of the summary. I think it's worth a shot.
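
Concretely, the difference is just in what the post-summary message list looks like; as a rough illustration (role names only, not actual memory records):

```python
# Current post-summary context: the model has to be nudged to keep going.
current_flow = [
    {"role": "system"},
    {"role": "assistant", "note": "summary of the whole context"},
    {"role": "assistant", "note": "'continue the task' prompt"},
]

# Proposed post-summary context: the last user message is the natural next
# thing to respond to, so no extra continuation prompt is needed.
proposed_flow = [
    {"role": "system"},
    {"role": "assistant", "note": "summary up to the last user message"},
    {"role": "user", "note": "last user message, kept verbatim"},
]
```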

What do you think?

hesamsheikh avatar Nov 05 '25 09:11 hesamsheikh

Hi @hesamsheikh, there are two scenarios. First, if the user's input causes the token count to exceed the limit, the user's input will be treated as the latest message. Second, if the user's input is interrupted by a tool call during agent operation, the user input will be summarized.
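
In other words, the two scenarios differ in whether the latest message in the history is the user's input; roughly (placeholder helper names, not the real tokenlimit code):

```python
def on_token_limit(messages, summarize_all, summarize_up_to_last_user):
    # Scenario 1: the user's input itself pushed the context over the limit,
    # so it is the latest message and could be kept out of the summary.
    if messages[-1]["role"] == "user":
        return summarize_up_to_last_user(messages)

    # Scenario 2: the limit is hit mid-run (e.g. during tool calls), so the
    # earlier user input is summarized together with everything else.
    return summarize_all(messages)
```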

fengju0213 avatar Nov 10 '25 04:11 fengju0213

Thanks for the answer @fengju0213, it would be great if we could also have @Wendong-Fan's input on this.

hesamsheikh avatar Nov 10 '25 13:11 hesamsheikh