
[Bug]: AgentStuckInLoopError: Agent got stuck in a loop

Open Geoffrey1014 opened this issue 10 months ago • 12 comments

Is there an existing issue for the same bug?

  • [x] I have checked the existing issues.

Describe the bug and reproduction steps

I ran the hello world example and hit the following error: AgentStuckInLoopError: Agent got stuck in a loop

[Two screenshots attached]

OpenHands Installation

Docker command in README

OpenHands Version

No response

Operating System

None

Logs, Errors, Screenshots, and Additional Context

No response

Geoffrey1014 avatar Mar 11 '25 04:03 Geoffrey1014

Hi @Geoffrey1014, thanks for reporting! Which model were you using when you encountered the error?

ryanhoangt avatar Mar 11 '25 05:03 ryanhoangt

> Hi @Geoffrey1014, thanks for reporting! Which model were you using when you encountered the error?

I have the same error on Anthropic 3.7 Sonnet and Haiku, plus OpenAI o3-mini-2025. (I only tried these 3 and none of them worked.)

06:12:17 - openhands:INFO: manage_conversations.py:138 - Initializing new conversation
06:12:17 - openhands:INFO: manage_conversations.py:48 - Creating conversation
06:12:17 - openhands:INFO: manage_conversations.py:52 - Loading settings
06:12:17 - openhands:INFO: manage_conversations.py:55 - Settings loaded
06:12:17 - openhands:INFO: manage_conversations.py:79 - Loading conversation store
06:12:17 - openhands:INFO: manage_conversations.py:81 - Conversation store loaded
06:12:17 - openhands:INFO: manage_conversations.py:87 - New conversation ID: 5ace
06:12:17 - openhands:INFO: manage_conversations.py:97 - Saving metadata for conversation 5ace
06:12:17 - openhands:INFO: manage_conversations.py:108 - Starting agent loop for conversation 5ace
06:12:17 - openhands:INFO: standalone_conversation_manager.py:247 - maybe_start_agent_loop:5ace
06:12:17 - openhands:INFO: standalone_conversation_manager.py:250 - start_agent_loop:5ace
06:12:17 - openhands:INFO: standalone_conversation_manager.py:296 - _get_event_stream:5ace
06:12:17 - openhands:INFO: standalone_conversation_manager.py:299 - found_local_agent_loop:5ace
06:12:17 - openhands:INFO: manage_conversations.py:126 - Finished initializing conversation 5ace
INFO:     172.17.0.1:56098 - "POST /api/conversations HTTP/1.1" 200 OK
INFO:     172.17.0.1:56098 - "GET /api/conversations/5ace HTTP/1.1" 200 OK
INFO:     ('172.17.0.1', 56102) - "WebSocket /socket.io/?latest_event_id=-1&conversation_id=5ace&EIO=4&transport=websocket" [accepted]
06:12:18 - openhands:INFO: listen_socket.py:28 - sio:connect: LtrPWuT7GLtSh2TvAAAH
06:12:18 - openhands:INFO: standalone_conversation_manager.py:111 - join_conversation:5ace:LtrPWuT7GLtSh2TvAAAH
06:12:18 - openhands:INFO: standalone_conversation_manager.py:296 - _get_event_stream:5ace
06:12:18 - openhands:INFO: standalone_conversation_manager.py:299 - found_local_agent_loop:5ace
06:12:18 - openhands:INFO: docker_runtime.py:140 - [runtime 5ace] Starting runtime with image: docker.all-hands.dev/all-hands-ai/runtime:0.28-nikolaik
06:12:19 - openhands:INFO: docker_runtime.py:144 - [runtime 5ace] Container started: openhands-runtime-5ace. VSCode URL: None
06:12:19 - openhands:INFO: docker_runtime.py:155 - [runtime 5ace] Waiting for client to become ready at http://host.docker.internal:31483...
06:12:41 - openhands:INFO: docker_runtime.py:161 - [runtime 5ace] Runtime is ready.
06:12:42 - openhands:INFO: base.py:318 - [runtime 5ace] Selected repo: None, loading microagents from /workspace/.openhands/microagents (inside runtime)
06:12:42 - openhands:INFO: prompt.py:95 - Loading microagents: []
06:12:42 - openhands:INFO: agent_session.py:151 - Agent session start
06:12:42 - USER_ACTION
[Agent Controller 5ace] **MessageAction** (source=EventSource.USER)
CONTENT: I want to create a VueJS app that allows me to:
* See all the items on my todo list
* add a new item to the list
* mark an item as done
* totally remove an item from the list
* change the text of an item
* set a due date on the item

This should be a client-only app with no backend. The list should persist in localStorage.
06:12:42 - openhands:INFO: agent_controller.py:471 - [Agent Controller 5ace] Setting agent(CodeActAgent) state from AgentState.LOADING to AgentState.RUNNING


==============
[Agent Controller 5ace] LEVEL 0 LOCAL STEP 0 GLOBAL STEP 0

06:12:43 - openhands:INFO: standalone_conversation_manager.py:102 - Conversation 5ace connected in 0.0318913459777832 seconds
06:12:43 - openhands:INFO: standalone_conversation_manager.py:76 - Reusing active conversation 5ace
INFO:     172.17.0.1:36140 - "GET /api/conversations/5ace/vscode-url HTTP/1.1" 200 OK
INFO:     172.17.0.1:36136 - "GET /api/conversations/5ace/list-files HTTP/1.1" 200 OK
06:12:43 - openhands:INFO: agent_controller.py:471 - [Agent Controller 5ace] Setting agent(CodeActAgent) state from AgentState.RUNNING to running
06:12:43 - OBSERVATION
[Agent Controller 5ace] AgentStateChangedObservation(content='', agent_state='running', observation='agent_state_changed')
06:12:43 - OBSERVATION
[Agent Controller 5ace] AgentCondensationObservation(content='Trimming prompt to meet context window limitations', observation='condense')


==============
[Agent Controller 5ace] LEVEL 0 LOCAL STEP 1 GLOBAL STEP 1

06:12:43 - OBSERVATION
[Agent Controller 5ace] AgentCondensationObservation(content='Trimming prompt to meet context window limitations', observation='condense')


==============
[Agent Controller 5ace] LEVEL 0 LOCAL STEP 2 GLOBAL STEP 2

06:12:44 - OBSERVATION
[Agent Controller 5ace] AgentCondensationObservation(content='Trimming prompt to meet context window limitations', observation='condense')


==============
[Agent Controller 5ace] LEVEL 0 LOCAL STEP 3 GLOBAL STEP 3

06:12:44 - openhands:WARNING: stuck.py:356 - Context window error loop detected - repeated condensation events
06:12:44 - openhands:INFO: agent_controller.py:471 - [Agent Controller 5ace] Setting agent(CodeActAgent) state from AgentState.RUNNING to AgentState.ERROR
06:12:44 - openhands:INFO: session.py:200 - Agent status error
06:12:44 - openhands:INFO: agent_controller.py:471 - [Agent Controller 5ace] Setting agent(CodeActAgent) state from AgentState.ERROR to AgentState.ERROR
06:12:44 - openhands:INFO: session.py:259 - Agent status error
06:12:44 - OBSERVATION
[Agent Controller 5ace] AgentStateChangedObservation(content='', agent_state='error', observation='agent_state_changed')

ShahabSotouni avatar Mar 11 '25 05:03 ShahabSotouni

Can confirm this on 4o-mini as well as 4o on release 0.28.1. I downgraded to 0.28.0 and things are working again. Wondering if it could be related to #7132 💭

https://github.com/All-Hands-AI/OpenHands/pull/7132/files#diff-d207970f6fc85ff139f8d72c59468fb2359ca8e365f35b5de9756c13a03db064R759

gfargo avatar Mar 11 '25 22:03 gfargo

I use gpt-4

Geoffrey1014 avatar Mar 12 '25 00:03 Geoffrey1014

I run into the same issue with all gpt models.

goedzo avatar Mar 13 '25 16:03 goedzo

This is probably the same issue as #7167. On the latest main branch (v0.28), I'm having trouble with all gpt-4o models, but gpt-4 models work fine. @xingyaoww any idea?

zhiyufan avatar Mar 13 '25 17:03 zhiyufan

I can confirm that downgrading to 0.28.0 solves the problem

rozek avatar Mar 14 '25 08:03 rozek

> I can confirm that downgrading to 0.28.0 solves the problem

Can confirm the same, so it is definitely caused by the update.

goedzo avatar Mar 14 '25 10:03 goedzo

cc @enyst 🤔 any idea if #7132 would cause this?

xingyaoww avatar Mar 14 '25 14:03 xingyaoww

🤔 this commit

  • #7132 stops the loop after 3 steps; it doesn't change why the loop happens in the first place
  • that was too aggressive, so we made it 10 steps
  • https://github.com/All-Hands-AI/OpenHands/pull/7237

Looking into it, maybe we should revert it completely
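
A rough sketch of the kind of check being discussed here, not the actual code in `stuck.py`: flag a loop once the last N events are all condensation events, with N as the threshold mentioned above (3, later 10). The `Event` class and `is_condensation_loop` function are hypothetical names used only for illustration; the `"condense"` label is taken from the `observation='condense'` field in the logs.

```python
from dataclasses import dataclass


@dataclass
class Event:
    # Hypothetical minimal event record; "condense" mirrors the
    # observation='condense' value seen in the logs above.
    kind: str


def is_condensation_loop(events: list[Event], threshold: int = 10) -> bool:
    """Return True when the last `threshold` events are all condensation events."""
    if threshold <= 0 or len(events) < threshold:
        return False
    return all(e.kind == "condense" for e in events[-threshold:])


# Three condensation events in a row, as in the log excerpt above.
events = [Event("condense"), Event("condense"), Event("condense")]
print(is_condensation_loop(events, threshold=3))   # True
print(is_condensation_loop(events, threshold=10))  # False
```

Under these assumptions, a threshold of 3 trips on the sequence from the logs while a threshold of 10 does not, which only delays the error and does not address why condensation keeps repeating.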

enyst avatar Mar 14 '25 14:03 enyst

@xingyaoww Please see: https://github.com/All-Hands-AI/OpenHands/pull/7252#issuecomment-2725003265

Was there a change to tools between 0.28.0 and main so that GPT-4 suddenly has more than 1024 tokens per tool, and so it fails with ContextWindowExceeded?
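
One way to check this hypothesis, sketched under loud assumptions: tools are represented as plain dicts with `name` and `description` keys, and description length in characters is used as a crude stand-in for tokens. `oversized_tools` and the sample tool names are hypothetical and not part of OpenHands.

```python
def oversized_tools(tools: list[dict], limit: int = 1024) -> list[str]:
    """Return the names of tools whose description is longer than `limit`.

    Length is measured in characters here, which is only a rough proxy
    for the token count discussed above.
    """
    return [t["name"] for t in tools if len(t.get("description", "")) > limit]


# Hypothetical tool definitions for illustration only.
tools = [
    {"name": "short_tool", "description": "Runs a command."},
    {"name": "long_tool", "description": "x" * 2000},
]
print(oversized_tools(tools))  # ['long_tool']
```

Running such a check against the tool definitions of 0.28.0 and of main would show whether any description grew past the limit between the two versions.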

enyst avatar Mar 14 '25 15:03 enyst

Ahh good catch, @ryanhoangt is working on a hot fix now 🙏

xingyaoww avatar Mar 14 '25 15:03 xingyaoww

@enyst @xingyaoww is this fixed?

neubig avatar Mar 24 '25 19:03 neubig

I think Ryan's hotfix is in, we can close it as fixed.

enyst avatar Mar 24 '25 19:03 enyst

  • use: openhands-lm-32b-v0.1
  • OpenHands==0.41.0
16:16:57 - openhands:WARNING: stuck.py:332 - Action, Observation pattern detected
16:16:57 - openhands:ERROR: loop.py:24 - AgentStuckInLoopError: Agent got stuck in a loop

chansonZ avatar Jun 05 '25 08:06 chansonZ

@chansonZ I believe some models may still get stuck in a loop. @enyst am I mistaken here? Should we just never get into a loop anymore?

mamoodi avatar Jun 05 '25 13:06 mamoodi

So it's like this

  • OH shouldn't get stuck in a loop right after it starts (before anything else has happened).
  • Once the task has started normally, the LLM can always get stuck in a loop. That just means it keeps returning the same response, and we don't want to waste tokens and money on identical useless responses, so we stop it and show the user an error.
  • The user can decide to continue by just telling it something. You may want to send a message that is a bit longer or different, or yell at it to take another path or to reflect on what it has done, so that it maybe responds differently and continues normally.

We can improve this, but I don't know if we can ever "solve" it completely in all circumstances. LLMs are nice black boxes and sometimes they just do that.
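
The stop-on-repeat behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the OpenHands implementation: the history is modelled as a list of (action, observation) string pairs, and `StuckInLoopError`, `check_not_stuck`, and the `repeats` parameter are hypothetical names.

```python
class StuckInLoopError(RuntimeError):
    """Hypothetical stand-in for the error shown to the user."""


def check_not_stuck(history: list[tuple[str, str]], repeats: int = 4) -> None:
    """Raise if the last `repeats` (action, observation) pairs are identical."""
    if repeats < 2 or len(history) < repeats:
        return
    tail = history[-repeats:]
    if all(step == tail[0] for step in tail):
        raise StuckInLoopError("Agent got stuck in a loop")


# The agent keeps issuing the same action and getting the same result.
history = [("ls", "a.txt")] * 4
try:
    check_not_stuck(history)
except StuckInLoopError as err:
    print(f"stopped: {err}")

# A new user message changes the tail of the history, so the check
# passes again and the agent can continue.
history.append(("user_message", "Try a different approach."))
check_not_stuck(history)
```

In this sketch a fresh user message is what breaks the repeated pattern, which matches the advice above to send the agent something different so that it may respond differently.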

enyst avatar Jun 05 '25 14:06 enyst