Redesign docker sandbox

Open rbren opened this issue 1 year ago • 6 comments

What problem or use case are you trying to solve? We're using exec_run to run commands in the sandbox. This isn't stateful, and doesn't handle CLI interactions via stdin very well.

Things we struggle with today:

  • We don't keep track of cd commands
  • The agent can't interact with stdin (e.g. when it runs apt-get install without -y and needs to type y to continue)
    • this is more important if we e.g. ask the agent to develop an interactive CLI that it needs to test
  • Can't use apt-get install in sandbox (due to permissions)
  • kill doesn't work
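The first bullet can be reproduced without Docker. The sketch below is only an illustration, not the project's code: it uses a local /bin/sh through subprocess as a stand-in for exec_run, and the names run_stateless, PersistentShell and the sentinel string are made up for this example. Each stateless call gets a fresh shell, so a cd is forgotten, while a single long-lived shell remembers it.

```python
import subprocess


def run_stateless(command: str) -> str:
    """Run a command in a brand-new shell, like one independent exec call."""
    result = subprocess.run(
        ["/bin/sh", "-c", command], capture_output=True, text=True
    )
    return result.stdout.strip()


class PersistentShell:
    """A single long-lived shell, so `cd` and exported variables persist."""

    MARKER = "__CMD_DONE__"  # hypothetical sentinel marking the end of output

    def __init__(self) -> None:
        self.proc = subprocess.Popen(
            ["/bin/sh"],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            bufsize=1,
        )

    def run(self, command: str) -> str:
        # Send the command, then ask the shell to print the sentinel when done.
        self.proc.stdin.write(f"{command}\necho {self.MARKER}\n")
        self.proc.stdin.flush()
        lines = []
        while True:
            line = self.proc.stdout.readline()
            if not line:  # the shell exited unexpectedly
                break
            text = line.rstrip("\n")
            if text.endswith(self.MARKER):
                # Keep any output that did not end with a newline.
                prefix = text[: -len(self.MARKER)]
                if prefix:
                    lines.append(prefix)
                break
            lines.append(text)
        return "\n".join(lines)

    def close(self) -> None:
        self.proc.stdin.close()
        self.proc.wait()
```

This pipe-based shell only addresses the lost working directory. It has no terminal attached, so it does not help with programs that prompt on stdin.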

Describe the UX of the solution you'd like Something closer to @xingyaoww's original implementation: https://github.com/xingyaoww/OpenDevin/blob/8815aa95ba770110e9d6a4839fb7f9cef01ef4d7/opendevin/sandbox/docker.py

Do you have thoughts on the technical implementation? Can we start the container, then connect an ssh or pty session?
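To make the pty idea concrete, here is a rough sketch using only Python's standard library (pty, select, subprocess). It is not taken from the linked implementation: run_with_pty and its replies parameter are hypothetical names, and it runs a local process rather than a container. It shows the part exec_run struggles with, namely reading a prompt from the terminal and typing an answer back.

```python
import os
import pty
import select
import subprocess


def run_with_pty(argv, replies, timeout=5.0):
    """Run argv attached to a pseudo-terminal and answer its prompts.

    replies is a list of (prompt_text, reply_text) pairs, handled in order.
    Returns everything the program wrote to the terminal.
    """
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(argv, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd)
    os.close(slave_fd)  # the child keeps its own copy of the slave end

    transcript = ""
    searched_upto = 0  # only look for the next prompt in unseen output
    pending = list(replies)
    try:
        while True:
            ready, _, _ = select.select([master_fd], [], [], timeout)
            if not ready:
                break  # nothing arrived within the timeout
            try:
                chunk = os.read(master_fd, 1024)
            except OSError:
                break  # Linux reports EIO once the child has closed the pty
            if not chunk:
                break
            transcript += chunk.decode(errors="replace")
            if pending and pending[0][0] in transcript[searched_upto:]:
                _, reply = pending.pop(0)
                os.write(master_fd, reply.encode())
                searched_upto = len(transcript)
    finally:
        os.close(master_fd)
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
    return transcript
```

For example, calling run_with_pty with a command that asks "Continue? [Y/n]" and the pair ("[Y/n]", "y\n") types y for the agent. The returned transcript also contains the echoed reply, because the terminal echoes input by default.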

Describe alternatives you've considered

  • Hacking around exec 👎

rbren avatar Mar 26 '24 21:03 rbren

Here's a suggestion from Slack: https://github.com/princeton-nlp/intercode

Maybe not quite the API we need, but we can take some inspiration from them at least

rbren avatar Mar 26 '24 21:03 rbren

How do you feel about doing a docker attach and then using my old implementation to read the pty of that session? This could be the "main session" that keeps track of cd commands.

xingyaoww avatar Mar 27 '24 05:03 xingyaoww

@xingyaoww that should probably work well

rbren avatar Mar 27 '24 13:03 rbren

I would suggest enabling sshd inside the docker sandbox and connecting to the docker environment via ssh. Then we need to capture the TTY of that ssh session inside Python.

To do that, there are some Python libraries that enable this kind of interactive session, such as Pexpect (https://pexpect.readthedocs.io/en/stable/), and more specifically its pxssh module: https://pexpect.readthedocs.io/en/stable/api/pxssh.html . Another alternative could be https://github.com/pexpect/ptyprocess.

At the very least, we can deal with interactive CLIs this way.
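A minimal sketch of what this could look like with pxssh, assuming the sandbox image already runs sshd and exposes it on some port. SSHSandboxSession and strip_echo are hypothetical names for illustration, and the host, port and credentials in the usage comment are placeholders, not project settings.

```python
def strip_echo(raw: str, command: str) -> str:
    """Drop the echoed command line that the terminal puts before the output."""
    lines = raw.replace("\r\n", "\n").strip("\n").split("\n")
    if lines and lines[0].strip() == command.strip():
        lines = lines[1:]
    return "\n".join(lines)


class SSHSandboxSession:
    """One persistent ssh login shell driven through pexpect's pxssh."""

    def __init__(self, host: str, username: str, password: str, port: int = 22):
        # Imported here so this sketch can be loaded without pexpect installed.
        from pexpect import pxssh

        self._ssh = pxssh.pxssh(encoding="utf-8")
        self._ssh.login(host, username, password, port=port)

    def run(self, command: str, timeout: int = 30) -> str:
        self._ssh.sendline(command)
        # prompt() waits for the next shell prompt and returns False on timeout.
        if not self._ssh.prompt(timeout=timeout):
            raise TimeoutError(f"no shell prompt after {timeout}s: {command!r}")
        # .before holds everything printed before the prompt, echo included.
        return strip_echo(self._ssh.before, command)

    def close(self) -> None:
        self._ssh.logout()


# Hypothetical usage (host, port and credentials are placeholders):
# session = SSHSandboxSession("localhost", "root", "password", port=2222)
# session.run("cd /workspace")
# print(session.run("pwd"))
# session.close()
```

Because every command goes through the same login shell, cd and environment changes carry over between calls. Answering an interactive prompt mid-command would need sendline and expect on the same session instead of waiting for the shell prompt.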

cc @neubig for input as well

frankxu2004 avatar Apr 06 '24 22:04 frankxu2004

@frankxu2004 thanks! Conceptually this sounds nice to me.

neubig avatar Apr 07 '24 01:04 neubig

@frankxu2004 I like this one! This allows us to easily unify the "persistent" session and the "background" sessions without managing all the docker sockets. It could also make the sandbox interface more general (e.g., we could use other machines as sandboxes as long as we can ssh into them). Feel free to open a PR if interested!

xingyaoww avatar Apr 07 '24 05:04 xingyaoww