UI should visualize how much of the context window is used by the agent
**What problem or use case are you trying to solve?**
To better understand how much of the context window is used and how much is available.

**Describe the UX of the solution you'd like**
A UI widget showing "% of context size used".

**Additional context**
LLMs follow instructions better when their context is small. Seeing how much of the context window is used gives us a better sense of how to prompt the LLM so that it uses the context window efficiently.
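For illustration, here is a minimal TypeScript sketch of the computation behind such a widget; the function name and inputs are hypothetical, not taken from the OpenHands codebase:

```typescript
// Hypothetical helper: percent of the model's context window consumed by
// the most recent prompt. Both inputs would come from the LLM provider.
function contextUsagePercent(promptTokens: number, contextWindow: number): number {
  if (contextWindow <= 0) return 0; // guard against missing model metadata
  return Math.min(100, (promptTokens / contextWindow) * 100);
}

// Example: a 150k-token prompt against a 200k window is 75% used.
console.log(contextUsagePercent(150_000, 200_000)); // 75
```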
If you find this feature request or enhancement useful, make sure to add a 👍 to the issue
@openhands please implement this feature request. The modal triggered by the "Display Cost" button should show:
- total input tokens for the conversation
- total output tokens for the conversation
- total input tokens in the most recent prompt
- total output tokens in the most recent prompt
- percent of context window used in most recent prompt
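As a rough sketch of the data this modal would need (field names are illustrative assumptions, not the actual OpenHands metrics API):

```typescript
// Illustrative payload shape for the metrics modal; these field names
// are assumptions, not the real OpenHands types.
interface ConversationMetrics {
  totalInputTokens: number;       // whole conversation
  totalOutputTokens: number;      // whole conversation
  lastPromptInputTokens: number;  // most recent prompt only
  lastPromptOutputTokens: number; // most recent prompt only
  contextWindow: number;          // model's maximum context size, in tokens
}

// Percent of the window occupied by the most recent prompt.
function lastPromptUsagePercent(m: ConversationMetrics): number {
  return Math.min(100, (m.lastPromptInputTokens / m.contextWindow) * 100);
}
```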
I'm on it! @rbren can track my progress at all-hands.dev
I've successfully implemented the feature request from issue #7554 to enhance the metrics modal with context window usage information. Here's a summary of the changes made:
**Changes Implemented**

- **Enhanced Metrics State**:
  - Added `mostRecentUsage` and `modelName` fields to the metrics state
  - Created a mapping of model names to their context window sizes
- **Updated Actions Handler**:
  - Modified the action handler to include token usage history and model name in the metrics data
- **Enhanced Metrics Modal UI**:
  - Added a new section for "Most Recent Prompt" showing:
    - Total input tokens for the most recent prompt
    - Total output tokens for the most recent prompt
    - Total tokens in the most recent prompt
  - Added context window usage visualization:
    - Percentage of context window used
    - Visual progress bar showing the usage
- **Added Tests**:
  - Created comprehensive tests for the new metrics display functionality
  - Verified that all metrics are displayed correctly
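To make the described mapping concrete, here is a sketch using the `mostRecentUsage` and `modelName` fields named above; the window sizes and lookup logic are assumptions for illustration, not the merged code:

```typescript
// Sketch of a model-name -> context-window mapping. Sizes shown are the
// published windows for these models; entries and shapes are illustrative.
const CONTEXT_WINDOWS: Record<string, number> = {
  "claude-3-5-sonnet": 200_000,
  "gpt-4o": 128_000,
};

interface MetricsState {
  modelName: string;
  mostRecentUsage: { inputTokens: number; outputTokens: number };
}

function usagePercent(state: MetricsState): number | null {
  const windowSize = CONTEXT_WINDOWS[state.modelName];
  if (windowSize === undefined) return null; // unknown model: hide the widget
  return Math.min(100, (state.mostRecentUsage.inputTokens / windowSize) * 100);
}
```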
**Results**
The implementation successfully addresses all requirements from the feature request:
- Shows total input tokens for the conversation
- Shows total output tokens for the conversation
- Shows total input tokens in the most recent prompt
- Shows total output tokens in the most recent prompt
- Shows percent of context window used in most recent prompt
I created a pull request (#7556) that includes all these changes and links back to the original issue. The PR includes appropriate tests and follows the project's coding standards.
@rbren Are you working on this currently, or just letting openhands-agent do the work? It seems the PR failed. I'll have bandwidth in a week; you can assign it to me if you don't plan to finish it before then.
Hey @AutoLTX, we're still working on some backend changes that should make this easier: https://github.com/All-Hands-AI/OpenHands/pulls/csmith49
But that should be in very soon, and once it is we'd love some help on the frontend!
Hey @AutoLTX , the backend should be fixed up now, so if you want to take a look at the CondensationAction, we could find ways to display it in the frontend. Would you like to take a look?
ACK. Thanks for the reminder, @neubig. Let me understand the CondensationAction first and then take a look at the implementation.
Thanks a lot!
@AutoLTX thanks for looking into this one. For the UX, if possible, I think visualizing the % used as a progress bar would be nice, so we visually get a sense of how much context length is remaining.
Look at https://mintlify.s3.us-west-1.amazonaws.com/factory/images/tutorial/step-8.webp to see how Factory AI does it.
This is how claude.ai visualizes it for projects:
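A minimal React/TypeScript sketch of the suggested progress bar; the colors and the 80% warning threshold are assumptions, not the Factory AI or claude.ai implementation:

```tsx
import React from "react";

// Renders a horizontal bar filled to `percent`, turning red when the
// context window is nearly full (threshold chosen for illustration).
function ContextUsageBar({ percent }: { percent: number }) {
  const clamped = Math.max(0, Math.min(100, percent));
  return (
    <div style={{ background: "#e5e7eb", borderRadius: 4, height: 8 }}>
      <div
        style={{
          width: `${clamped}%`,
          height: "100%",
          borderRadius: 4,
          background: clamped > 80 ? "#ef4444" : "#3b82f6", // red when nearly full
        }}
      />
    </div>
  );
}
```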
Hi @AutoLTX , just checking in on this issue, have you had a chance to take a look at it?
Hey @neubig, yes, I've read the info you provided. Something urgent unexpectedly came up over the past two weeks and I didn't have time to implement it. I promise to submit a PR (at least a draft with the feature enabled; it may still need refinement) for this issue by the end of this week.
OK, great thank you, much appreciated!
During my implementation, I found that @xingyaoww had previously merged a PR to calculate tokens in aggregate rather than per request. I'm wondering whether this will conflict with the condensation-related implementation from @csmith49 (just an assumption; I haven't gone through the logic yet). At least in my local test, the dashboard shows that one hello-world web app cost 10k tokens. $0.04 USD for Claude 3.5 is about 4,000-5,000 tokens? I guess this is not correct. 😳 Let me investigate the real root cause first. I'll report back soon.
PR mentioned: https://github.com/All-Hands-AI/OpenHands/commit/c63d52d5e6a420acff49d00cf5537c7647bf3dca#diff-e727565809472850de398d0642308a6f657451d04aa3921693ebda68a4b569cc
And @c3-ali, do you think this meets your requirement? (Ignore the % value for now, hah; I'll modify it soon.)
Hey @AutoLTX - my PR there sends the "accumulated total tokens used" to the FE instead of individual messages.
Maybe we can also add the "per-turn" token usage back here so we can calculate the context window usage correctly? https://github.com/All-Hands-AI/OpenHands/blob/1c4c477b3f9eb38a664f43cff5b83561e5314166/openhands/controller/agent_controller.py#L1247-L1255
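To illustrate the distinction, a sketch under assumed field names: the accumulated totals described above can't recover the last turn's footprint on their own, so a per-turn record is what the context-window percentage needs:

```typescript
// Assumed shapes: `accumulated` is what the backend currently sends
// (running totals); `perTurn` is the proposed addition. Only the most
// recent turn's input tokens reflect what currently sits in the window.
interface TurnUsage {
  inputTokens: number;
  outputTokens: number;
}

interface FrontendMetrics {
  accumulated: TurnUsage; // totals across the whole conversation
  perTurn: TurnUsage[];   // proposed: one entry per agent turn
}

function lastTurnContextPercent(m: FrontendMetrics, contextWindow: number): number | null {
  const last = m.perTurn.at(-1);
  if (!last) return null; // no turns yet: nothing to display
  return Math.min(100, (last.inputTokens / contextWindow) * 100);
}
```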
Hey @AutoLTX , this is looking good! If you have any other questions we're happy to help.
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
I think this is resolved now?
Yes, I think so! Thanks @AutoLTX