Trajectory replay on web GUI

Open li-boxuan opened this issue 11 months ago • 5 comments

End-user friendly description of the problem this fixes or functionality that this introduces

  • [x] Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below

Give a summary of what the PR does, explaining any non-trivial design decisions

Support trajectory replay on web GUI.

[Image: Screenshot 2025-02-02 at 12 42 20 AM]

Link of any specific issues this addresses

#6049


To run this PR locally, use the following command:

docker run -it --rm   -p 3000:3000   -v /var/run/docker.sock:/var/run/docker.sock   --add-host host.docker.internal:host-gateway   -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:44d5fd7-nikolaik   --name openhands-app-44d5fd7   docker.all-hands.dev/all-hands-ai/openhands:44d5fd7

li-boxuan • Jan 18 '25 20:01

Since this is adding new UI elements, I think we'd want a quick look from the designer. CC @amanape so he can coordinate that if possible.

mamoodi • Feb 02 '25 18:02

I agree with @mamoodi. Also, given that this is advanced functionality that most users will not use, maybe we can put it somewhere else. For example, we could create an "advanced functionality" button in the sidebar that starts out with just this feature but eventually houses other options that are outside the main usage path.

neubig • Feb 03 '25 12:02

Yeah I agree it's "advanced" functionality that we should somehow hide from normal users.

My primary intention is to be able to replay trajectories from benchmarks for 1) debugging and 2) demos.

li-boxuan • Feb 14 '25 08:02

I may be an odd bird, but I find a use case like this exciting:

  • run on the local runtime
  • until a prompt confirmation warning says you shouldn't do it (unexpected installation, red zone, rm -rf / 😅) [1]
  • then press a button to transfer to the remote runtime
  • replay and go

[1] Of course, the warning comes with a voice saying "I'm sorry, Boxuan, I'm afraid I can't do that."

OK, probably not a suitable use case 😅 (what about increasing the runtime resources, does it just work?)

enyst • Feb 15 '25 16:02

Haha, that sounds fun. It's indeed a potentially very interesting scenario: OpenHands can fail due to hardware resource limits, a runtime crash, an API token limit… or, like you said, halt due to a security concern. "Transfer to remote runtime and replay" sounds very attractive.

li-boxuan • Feb 16 '25 05:02

@rbren were any actions taken on this?

mamoodi • Feb 26 '25 15:02

@li-boxuan can we revert the FE changes so we can just get the backend change in?

I'm also curious if https://trajectory-visualizer.all-hands.dev/ should take the place of this

rbren • Mar 19 '25 14:03

Sure I'll revert the UI changes and keep the functionality.

Trajectory replay has two uses:

  1. Visualize what has happened in a session, step by step
  2. Reproduce and, optionally, continue. For example, I may want to test a new micro-agent at a given step without starting my experiment over (which is costly and non-deterministic/non-reproducible).

The trajectory visualizer replaces the first use, but not the second.
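
To make the second use concrete, here is a minimal sketch of "replay up to a step, then continue". It is an illustration only: the `Event` shape, the `replay` function, and the `stop_after` parameter are hypothetical names, not the OpenHands trajectory schema or API, and it assumes the recorded events are already ordered by step.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Event:
    """One recorded step of a session (hypothetical shape, not the real schema)."""
    step: int
    action: str


def replay(
    trajectory: Iterable[Event],
    execute: Callable[[Event], None],
    stop_after: int,
) -> List[Event]:
    """Re-execute recorded events in order, up to and including `stop_after`."""
    replayed: List[Event] = []
    for event in trajectory:
        if event.step > stop_after:
            break  # everything after this point is left to a live agent
        execute(event)  # re-run the recorded action instead of querying the agent
        replayed.append(event)
    return replayed


# Usage: replay the first two recorded steps, then continue from there.
recorded = [Event(1, "open file"), Event(2, "edit file"), Event(3, "run tests")]
log: List[str] = []
done = replay(recorded, lambda e: log.append(e.action), stop_after=2)
```

Because the recorded actions are re-run rather than regenerated, the expensive and non-deterministic part of the experiment is not repeated; only the steps after `stop_after` would involve the agent again.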

li-boxuan • Mar 20 '25 04:03

@rbren @amanape I have removed the UX part. The functionality exists but is not accessible to users.

li-boxuan • Mar 20 '25 06:03

Thanks! If we want to have some esoteric key combination trigger the UI for it, that's fine too. Or maybe a feature flag: https://github.com/All-Hands-AI/OpenHands/blob/d9926d2491384421dc38d3e68720bfb1b486db2e/frontend/src/utils/feature-flags.ts#L4

Edit: feature flag would actually be a great way to enable this for you and other researchers

rbren • Mar 21 '25 13:03

Looks like something changed recently and broke the replay feature in the web app, but not in headless mode... will look into it.

li-boxuan • Mar 22 '25 06:03

@amanape would you like to review again? This feature is now disabled by default

li-boxuan • Mar 24 '25 05:03