
[Bug]: Endless “CondensationAction” Loop Caused by Constant Context Overflow

Open · n3roGit opened this issue 6 months ago • 1 comment

Is there an existing issue for the same bug? (If one exists, thumbs up or comment on the issue instead).

  • [x] I have checked the existing issues.

Describe the bug and reproduction steps

Description
When running the OpenHands resolver as a GitHub Action, the internal context window is repeatedly flagged as “full,” triggering automatic history condensation (CondensationAction) over and over. The patch step (e.g. writing “HALLO WELT” to README.md) is never reached and the job never completes.

To Reproduce

  1. Label an issue or post a comment to trigger the resolver.
  2. Observe the GitHub Action log:

Context window exceeded. Keeping events with IDs: {0, 1, 2, 3, …}
CondensationAction(action=<ActionType.CONDENSATION: 'condensation'>, …)

  3. Notice that the resolver never applies the patch to README.md and never exits.

Expected Behavior
After at most one condensation step, the resolver should apply the patch and then emit a finish action to terminate the job.

Actual Behavior
The resolver continuously condenses its context history and never proceeds to make any edits or finish, resulting in an infinite loop.

Log Snippet


21:21:40 - openhands:INFO: agent_controller.py:1161 - Context window exceeded. Keeping events with IDs: {0, 1, 2, 3}
21:21:40 - openhands:INFO: resolve_issue.py:381 - CondensationAction(action=<ActionType.CONDENSATION: 'condensation'>, …)
21:22:00 - openhands:INFO: agent_controller.py:1161 - Context window exceeded. Keeping events with IDs: {0, 1, 2, 3, 5}
21:22:00 - openhands:INFO: resolve_issue.py:381 - CondensationAction(action=<ActionType.CONDENSATION: 'condensation'>, …)
… (repeats indefinitely) …

Environment:

  • OpenHands Resolver v0.39.0 (container ghcr.io/all-hands-ai/runtime:0.39.0-nikolaik)
  • GitHub Action workflow: All-Hands-AI/OpenHands/.github/workflows/openhands-resolver.yml@main
  • Runner: ubuntu-latest
  • max_iterations: 5
  • LLM_MODEL: gpt-4o

Additional Context
This issue is discussed in #6357. A manual workaround is to pass --no-condense, but the current workflow does not support this flag by default.



### OpenHands Installation

GitHub resolver

### OpenHands Version

0.39

### Operating System

Linux

### Logs, Errors, Screenshots, and Additional Context

https://productionresultssa19.blob.core.windows.net/actions-results/1656be04-c726-4d04-b83c-b03e8f968207/workflow-job-run-6d8c4aeb-32b1-5144-9c52-4f41405593e5/logs/job/job-logs.txt?rsct=text%2Fplain&se=2025-05-22T06%3A15%3A24Z&sig=idd2dLwDYIshQtC0ghzVeDQ6uBoa1jYa9reYX9xHivw%3D&ske=2025-05-22T14%3A12%3A55Z&skoid=ca7593d4-ee42-46cd-af88-8b886a2f84eb&sks=b&skt=2025-05-22T02%3A12%3A55Z&sktid=398a6654-997b-47e9-b12b-9515b896b4de&skv=2025-05-05&sp=r&spr=https&sr=b&st=2025-05-22T06%3A05%3A19Z&sv=2025-05-05

n3roGit avatar May 22 '25 06:05 n3roGit

CC @csmith49 or @malhotra5 not sure which one of you would be best for this.

mamoodi avatar May 22 '25 14:05 mamoodi

Worth noting the linked issue (https://github.com/All-Hands-AI/OpenHands/issues/6357) is likely resolved with the context management changes from https://github.com/All-Hands-AI/OpenHands/pull/7578 and https://github.com/All-Hands-AI/OpenHands/pull/7353. I expect this looping behavior is a different beast entirely.

One key detail here is that the condensation is being triggered from the agent controller:

21:21:40 - openhands:INFO: agent_controller.py:1161 - Context window exceeded. Keeping events with IDs: {0, 1, 2, 3}

This happens whenever the messages we send to the LLM can't fit into the context window (as opposed to the other condensation actions, which trigger based on, e.g., events in the stream). But we've also tuned that condensation to keep the earliest messages so we don't lose critical context, so it's likely that 1) those initial messages alone are enough to exceed the LLM's context window, 2) we deliberately keep them every time we condense, and hence 3) we loop.
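To make the hypothesis concrete, here's a rough sketch of the failure mode. This is illustrative only, not the actual agent_controller.py logic; `MAX_CONTEXT_TOKENS`, `HEAD_EVENTS_TO_KEEP`, and `token_count()` are stand-ins for whatever the real LLM wrapper exposes:

```python
# Illustrative sketch of the hypothesized loop, not the actual
# agent_controller.py implementation. The constants and token_count()
# below are stand-ins, not real OpenHands names.

MAX_CONTEXT_TOKENS = 128_000   # e.g. gpt-4o's context window
HEAD_EVENTS_TO_KEEP = 4        # the protected head: "Keeping events with IDs: {0, 1, 2, 3}"


def token_count(events):
    # Crude stand-in for a real tokenizer-based count.
    return sum(len(str(e)) // 4 for e in events)


def step(history):
    if token_count(history) > MAX_CONTEXT_TOKENS:
        # Condense: keep the protected head, drop the tail.
        condensed = history[:HEAD_EVENTS_TO_KEEP]
        # If the protected head alone already exceeds MAX_CONTEXT_TOKENS,
        # the very next step lands here again and the agent never gets to
        # edit README.md or emit a finish action -- the loop in this issue.
        return condensed, "CondensationAction"
    return history, "normal agent step"
```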

@n3roGit any thoughts on this hypothesis? Have you observed this when triggering the resolver with very little context?

It should be possible to detect when this kind of condensation loop happens, but I'm not sure how that will fit into the resolver workflow. @malhotra5 if we detect this kind of failure is it possible to try running the agent again with condensation disabled?
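For concreteness, a guard along these lines could flag the failure before max_iterations is exhausted. All of the names below are hypothetical, sketched for discussion only, and not existing OpenHands hooks:

```python
# Hypothetical condensation-loop guard; none of these names exist in
# OpenHands today.
from collections import deque

MAX_CONSECUTIVE_CONDENSATIONS = 3
_recent_actions: deque[str] = deque(maxlen=MAX_CONSECUTIVE_CONDENSATIONS)


def record_action(action_type: str) -> None:
    _recent_actions.append(action_type)


def condensation_loop_detected() -> bool:
    # Stuck if the last N actions were all condensations, with no normal
    # agent step (edit, finish, ...) in between.
    return (
        len(_recent_actions) == MAX_CONSECUTIVE_CONDENSATIONS
        and all(a == "condensation" for a in _recent_actions)
    )
```

If that trips, one option would be to abort with a clear error, and another would be the retry-without-condensation idea above.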

csmith49 avatar May 30 '25 16:05 csmith49

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar Jun 30 '25 02:06 github-actions[bot]

This issue was closed because it has been stalled for over 30 days with no activity.

github-actions[bot] avatar Jul 07 '25 02:07 github-actions[bot]