
What is the best way to set up an OpenHands container running locally to access compilers in a different container?

Open bartlettroscoe opened this issue 6 months ago • 16 comments

What is the best way to set up an OpenHands container running locally so it can run compilers and test commands in a separate container? For example, I have a separate container called 'tril-clang-19' that has the clang-19 compilers and build tools, and I have my source and build directories mounted in both the OpenHands container and the 'tril-clang-19' container. I need to tell OpenHands to test changes to code and generated tests by building object files, linking executables, and running test executables in the tril-clang-19 container. OpenHands has access to the source code in its container, passed in through SANDBOX_VOLUMES=/home/rabartl/Trilinos.base/Trilinos:/workspace:rw, so it can search, analyze, and modify the source files, but it can't build and run the code on its own.

Gemini 2.5 suggests using docker compose:

  • https://g.co/gemini/share/a8d891557569

Is that the right way to go about doing this? Or, do I need to expose the 'tril-clang-19' container as an MCP tool?

I can provide a detailed example to make things concrete if that would help. I am just hoping that other people have tried something similar.

BTW, this setup was easy with VSCode + GitHub Copilot Agent Mode running the build in the 'tril-clang-19' container. I just told it what commands to run and it did it. (But I can't script VSCode + GitHub Copilot Agent Mode and run a large evaluation test suite so that is why I am trying to use OpenHands.)

bartlettroscoe avatar Jun 12 '25 22:06 bartlettroscoe

Yes, you would need to use MCP or do something custom that's similar. Through whatever means, you'd need to provide the agent with a function, or functions, that it can call or otherwise trigger to do what you want, with the response returned as feedback for the agent. There are various ways that could be done, but MCP is intended to standardize the process. In any case, results will vary depending on the implementation, prompts used, and the model used, so you'll need to test your processes and make tweaks until you get something that works reliably for your use case. I'd be interested in hearing back on how you proceed and can try to help if you have issues.

matty-fortune avatar Jun 14 '25 18:06 matty-fortune

@matty-fortune,

Yes, you would need to use MCP or do something custom that's similar.

10-4.

I have never implemented an MCP function before. From looking at:

  • https://docs.all-hands.dev/usage/mcp#model-context-protocol-mcp

it seems this would likely be a stdio MCP function?

Is there a simple free MCP server you would recommend that I use to implement this on my local machine? Doing a little searching, I see the Python module FastMCP:

  • https://github.com/jlowin/fastmcp

Just looking for a recommendation on a simple MCP framework I can use for this purpose.

In any case, results will vary depending on the implementation, prompts used, and the model used, so you'll need to test your processes and make tweaks until you get something that works reliably for your use case.

Right. I am expecting that I can just tell the model what MCP functions to call and with what arguments to build and test the code. This is what I did with VSCode Copilot Agent Mode. I had a CMake build of a C++ project configured in a subdirectory with complete build settings, and I told the model (Claude Sonnet 4 and 3.7 Sonnet) exactly what commands to run to build the object files and test executables, and what commands to run to run the tests (to get feedback on the changes to the code it was making). The model seemed to be smart enough to obey my instructions for the most part. (But it still tried to run terminal commands to drive the build itself, which is not going to be possible with OpenHands.)

I'd be interested in hearing back on how you proceed and can try to help if you have issues.

I will definitely post back here on how that goes.

An AI Agent can't get build and runtime feedback on C++ code it is changing without a C++ compiler and linker and the ability to run the linked C++ executables. How are people using OpenHands with C++ code without this type of feedback?

bartlettroscoe avatar Jun 15 '25 00:06 bartlettroscoe

@matty-fortune,

Yes, you would need to use MCP or do something custom that's similar.

10-4.

I have never implemented an MCP function before. From looking at:

  • https://docs.all-hands.dev/usage/mcp#model-context-protocol-mcp

it seems this would likely be a stdio MCP function?

You could use either STDIO or SSE (HTTP) using a localhost URL, which might be easier if you're more familiar with web server stuff than STDIO.

Is there a simple free MCP server you would recommend that I use to implement this on my local machine? Doing a little searching, I see the Python module FastMCP:

  • https://github.com/jlowin/fastmcp

Just looking for a recommendation on a simple MCP framework I can use for this purpose.

Yeah, for Python either FastMCP or the official Python SDK, which uses FastMCP 1.0: https://github.com/modelcontextprotocol/python-sdk

There are more options if you're interested in TypeScript or other languages: https://medium.com/@FrankGoortani/comparing-model-context-protocol-mcp-server-frameworks-03df586118fd
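For illustration, a minimal FastMCP server that forwards build/test commands to the 'tril-clang-19' container via `docker exec` could look something like the sketch below. The container name, tool name, and `docker exec` plumbing are assumptions for this thread's setup, not a tested implementation:

```python
# Sketch only: an MCP tool that forwards build/test commands to a sibling
# container via `docker exec`. Container name and timeout are assumptions.
import subprocess

TARGET_CONTAINER = "tril-clang-19"

def docker_exec_argv(container: str, command: str) -> list[str]:
    """Build the argv that runs `command` inside `container` through a shell."""
    return ["docker", "exec", container, "bash", "-lc", command]

def run_in_container(command: str, timeout: int = 1800) -> str:
    """Forward a command to the target container; return exit code plus output."""
    proc = subprocess.run(docker_exec_argv(TARGET_CONTAINER, command),
                          capture_output=True, text=True, timeout=timeout)
    return f"exit={proc.returncode}\n{proc.stdout}{proc.stderr}"

def make_server():
    """Wire the forwarder up as a FastMCP tool (requires `pip install fastmcp`)."""
    from fastmcp import FastMCP  # deferred import so the helpers above stand alone
    mcp = FastMCP("tril-clang-19-build")

    @mcp.tool()
    def build_and_test(command: str) -> str:
        """Run a build or test command inside the tril-clang-19 container."""
        return run_in_container(command)

    return mcp
```

Calling `make_server().run()` serves the tool over STDIO by default; the agent would then call `build_and_test` with, e.g., the `ninja` or `ctest` commands for the build directory.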

In any case, results will vary depending on the implementation, prompts used, and the model used, so you'll need to test your processes and make tweaks until you get something that works reliably for your use case.

Right. I am expecting that I can just tell the model what MCP functions to call and with what arguments to build and test the code. This is what I did with VSCode Copilot Agent Mode. I had a CMake build of a C++ project configured in a subdirectory with complete build settings, and I told the model (Claude Sonnet 4 and 3.7 Sonnet) exactly what commands to run to build the object files and test executables, and what commands to run to run the tests (to get feedback on the changes to the code it was making). The model seemed to be smart enough to obey my instructions for the most part. (But it still tried to run terminal commands to drive the build itself, which is not going to be possible with OpenHands.)

Yeah, it's just a matter of making it reliable and handling edge cases. You could actually allow it to use whatever commands it wants by instructing it to use a specific MCP tool that forwards the commands and then runs them within your target container, but that's getting slightly more advanced.

I'd be interested in hearing back on how you proceed and can try to help if you have issues.

I will definitely post back here on how that goes.

👍

An AI Agent can't get build and runtime feedback on C++ code it is changing without a C++ compiler and linker and the ability to run the linked C++ executables. How are people using OpenHands with C++ code without this type of feedback?

I can't speak as to how people are using it for C++ code, but in general it would either be by using a custom sandbox (the Docker container the agent has access to) that has everything preinstalled, or by using MCP or similar to connect to another system that has everything installed (as you asked about), and using specialized prompts in either case. If you haven't considered the custom sandbox route, perhaps that's an option to look into. You mentioned a separate container, so I didn't think to mention it before: https://docs.all-hands.dev/usage/how-to/custom-sandbox-guide

matty-fortune avatar Jun 15 '25 18:06 matty-fortune

I should also add that in your specific use case of having a separate container, using SSE with a localhost URL will likely be much more straightforward than using STDIO, which would require a lot of manual plumbing work to get everything connected due to the containerization.
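To sketch what the SSE wiring might look like (the config key names and URL here are assumptions to be verified against the MCP docs linked above): run the FastMCP server with its SSE transport, e.g. `mcp.run(transport="sse", host="0.0.0.0", port=8000)`, then point OpenHands at the resulting localhost URL in `config.toml`:

```toml
# Hypothetical OpenHands MCP config; check the key names against
# https://docs.all-hands.dev/usage/mcp before relying on this.
[mcp]
sse_servers = ["http://host.docker.internal:8000/sse"]
```

`host.docker.internal` is one way for the OpenHands container to reach a server running on the host; a shared Docker network using the server container's name as the hostname would also work.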

matty-fortune avatar Jun 15 '25 18:06 matty-fortune

If you haven't considered the custom sandbox route, perhaps that's an option to look into. You mentioned a separate container, so I didn't think to mention it before: https://docs.all-hands.dev/usage/how-to/custom-sandbox-guide

Thanks for pointing that out! However, those instructions say you need to start with a Debian-based OS image. Is that really a requirement? Our images have a RedHat base. Having to change the base image would kind of defeat the purpose of the specialized environment containers that we have for our various systems. Extending our containers (with additional container layers) with additional packages and tweaks is perfectly fine and would be expected, but changing out the guts is problematic.

What is special about a Debian base? Instead, could there be a required API that the sandbox container has to implement to accept commands? Does the OpenHands container use docker exec to run commands in the sandbox container? If not, how does it control the sandbox container? It would be great to have additional flexibility in this sandbox container.

Please let me know what you would advise:

  1. Try using our RedHat base image container as the sandbox container

  2. Bring in our RedHat base container as an MCP tool

Seems like option 1 would be a lot simpler?

bartlettroscoe avatar Jun 16 '25 18:06 bartlettroscoe

Ah, yes, since it says it must be Debian based, I assume it must be Debian based. My assumption would be that some of the things it does require, or at least presuppose, packages from Debian, which uses a different package manager than Red Hat. Perhaps that could be changed, but it'd involve forking that code to use alternative packages, which would likely involve trial and error at the least.

Yes, the required API is essentially injected into the sandbox container by OpenHands. I believe OpenHands uses a client-server architecture to communicate with the runtime (sandbox) container through a RESTful API, similarly to how MCP using SSE works, with an HTTP server running inside of the container and the OpenHands code connecting to it as a client.

Option 1 would indeed likely be simpler if that worked for you, but it seems like it doesn't, and in any case it'd still involve custom prompts to explain to the agent that the available tools exist.

Option 2 is a solid option that shouldn't take all that long to get to a proof of concept that can be iterated on and it has the additional benefits of (1) being able to provide a consistent interface that'd be usable with other preexisting containers in the future and (2) avoiding lock-in to OpenHands, since MCP is a simple, open standard that's gaining traction, whereas the custom sandbox is more bespoke to OpenHands.

At this point, with the information so far, I'd suggest going the MCP route which seems like it'll surely be workable for your use case without any significant gotchas hiding anywhere (like the package manager issue above).

matty-fortune avatar Jun 16 '25 19:06 matty-fortune

Ah, yes, since it says it must be Debian based, I assume it must be Debian based. My assumption would be that some of the things it does require, or at least presuppose, packages from Debian, which uses a different package manager than Red Hat.

That seems to be correct. When I tried using our RedHat-based image with:

-e SANDBOX_BASE_CONTAINER_IMAGE=clang-19.1.6-openmpi-4.1.6-trilinos-env:2025-06-05

I got the errors:

🐳 Starting Docker runtime...

E: List directory /var/lib/apt/lists/partial is missing. - Acquire (13: Permission denied)
15:06:06 - openhands:ERROR: docker.py:120 - Image build failed:
Command 'apt-get update' returned non-zero exit status 100.
15:06:06 - openhands:ERROR: docker.py:121 - Command output:
None
An error occurred: Command 'apt-get update' returned non-zero exit status 100.

And trying:

-e SANDBOX_RUNTIME_CONTAINER_IMAGE=clang-19.1.6-openmpi-4.1.6-trilinos-env:2025-06-05

I got the errors:

🐳 Starting Docker runtime...

E: List directory /var/lib/apt/lists/partial is missing. - Acquire (13: Permission denied)
15:13:23 - openhands:ERROR: docker.py:120 - Image build failed:
Command 'apt-get update' returned non-zero exit status 100.

15:13:23 - openhands:ERROR: docker.py:121 - Command output:
An error occurred: Command 'apt-get update' returned non-zero exit status 100.

So the OpenHands container is assuming a lot about the base OS of the runtime container image.

Option 2 is a solid option that shouldn't take all that long to get to a proof of concept that can be iterated on and it has the additional benefits of (1) being able to provide a consistent interface that'd be usable with other preexisting containers in the future and (2) avoiding lock-in to OpenHands, since MCP is a simple, open standard that's gaining traction, whereas the custom sandbox is more bespoke to OpenHands.

At this point, with the information so far, I'd suggest going the MCP route which seems like it'll surely be workable for your use case without any significant gotchas hiding anywhere (like the package manager issue above).

Okay, that seems reasonable. I will let you know how that goes. Thanks for all your help!

bartlettroscoe avatar Jun 17 '25 15:06 bartlettroscoe

That confirms it needs APT and working around that would probably turn out to be fairly complex. Happy to help 👍

matty-fortune avatar Jun 17 '25 22:06 matty-fortune

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar Jul 22 '25 02:07 github-actions[bot]

I am going to hopefully work this week on the FastMCP server approach to provide build tools and running tests through our custom RedHat-based container.

bartlettroscoe avatar Jul 22 '25 12:07 bartlettroscoe

FYI: We are taking a different approach. We are creating a derived container based on our base container and installing the OpenHands CLI into that derived container. The derived container builds very fast. The OpenHands CLI then runs in the same environment as the custom compilers and tools, so it can safely run any terminal command it wants, including build and test commands. And the developer has one system to log into to inspect what is happening. I will report on our final results and the basics of the Dockerfile that makes this work.

Also, this approach should work for other similar coding agents as well like OpenAI Codex, Claude Code, or any agent you install and run locally. (VSCode + GitHub Copilot Chat Agent Mode with Dev Containers is a different type of beast.)

We are looking into how to further lock down these containers so that the only communication outside of the container (other than specific mounted directories from the host) is to a specific model API endpoint (which we hope is robust against attacks orchestrated by a misbehaving LLM-based agent, prompt-injection attacks, etc.).
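As a rough sketch of the kind of Dockerfile involved (the base image tag is our internal image mentioned earlier in this thread, and the install commands are placeholders, not a final recipe):

```dockerfile
# Sketch only: derive from the existing RedHat-based compiler image and
# bake the OpenHands CLI into the derived image at build time.
FROM clang-19.1.6-openmpi-4.1.6-trilinos-env:2025-06-05

# Install uv, then use it to install the OpenHands CLI system-wide
RUN python -m pip install uv \
 && uv pip install --system openhands-ai

# Drop into a shell; run `openhands` from inside for the CLI
CMD ["/bin/bash"]
```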

bartlettroscoe avatar Aug 29 '25 13:08 bartlettroscoe

We are creating a derived container based on our base container and installing OpenHands CLI into that derived container.

So far, this is working very well. By building a container derived from our specialized RedHat container and installing OpenHands in that derived container, I can have the agent perform a refactoring using our compilers, etc.

Now for my question up front: what is the recommended way to fully install the OpenHands software inside of a container image so that when you run it in the running container, it just runs and does not need to install anything else or require any network communication at all (other than to the model endpoint)?

TL;DR:

With this setup, for example, I was able to run the following prompt with OpenHands:

TASK: Perform the Extract Function refactoring on the function Teuchos::CommandLineProcessor::parse() that is declared and defined in the files packages/teuchos/core/src/Teuchos_CommandLineProcessor.hpp and packages/teuchos/core/src/Teuchos_CommandLineProcessor.cpp. Extract out functions to make the code simpler and easier to maintain. Update the parent function to call the new functions.

STEPS: To perform this refactoring, perform the following steps:

**Step 1:** Create the initial changes to the file

Perform the Extract Function refactoring as described above and update the files Teuchos_CommandLineProcessor.hpp and Teuchos_CommandLineProcessor.cpp

**Step 2:** Get the object file Teuchos_CommandLineProcessor.cpp.o to build

To get feedback on the correctness of this refactoring, first get the object file Teuchos_CommandLineProcessor.cpp.o to build by running the command:

  cd BUILDS/clang-19-simple && ninja packages/teuchos/core/src/CMakeFiles/teuchoscore.dir/Teuchos_CommandLineProcessor.cpp.o

Look at any compiler errors returned from that command and make fixes to those two files until the build errors are resolved.

**Step 3:** Get the test executable TeuchosCore_CommandLineProcessor_test.exe to build

After the object file Teuchos_CommandLineProcessor.cpp.o builds correctly from the previous step, get feedback on the linking of the test executable TeuchosCore_CommandLineProcessor_test.exe by running the build command:

  cd BUILDS/clang-19-simple && ninja packages/teuchos/core/test/CommandLineProcessor/TeuchosCore_CommandLineProcessor_test.exe

Fix any build or link errors by updating the source files.

**Step 4:** Run the test TeuchosCore_CommandLineProcessor_test

Get feedback on the correct functioning of the test by running the test using the command:

  cd BUILDS/clang-19-simple && ctest -VV -R TeuchosCore_CommandLineProcessor_test

If there are any test failures, update the source code to the files Teuchos_CommandLineProcessor.hpp and Teuchos_CommandLineProcessor.cpp and go back to Step 2 and Step 3 to get the object file and test executable to build before running the test again.

**Step 5:** Summarize the results of the refactoring and testing

Once the test TeuchosCore_CommandLineProcessor_test runs and passes successfully, summarize the refactoring that was done and the testing that was done.

At this point, you are done.

Using gpt-5, that refactoring worked perfectly, and the agent ran the exact commands listed in the prompt using our underlying compilers and other tools installed in the base container (as verified in the output).

Now for some details and my question ...

To build the derived container that installs OpenHands, as per the instructions at:

  • https://docs.all-hands.dev/usage/how-to/cli-mode

during the docker build, in the Dockerfile, I am running:

python -m pip install --trusted-host=pypi.org --trusted-host=files.pythonhosted.org uv
uv pip install --system --trusted-host=pypi.org --trusted-host=files.pythonhosted.org openhands-ai

But after starting up the container and running:

uvx --python 3.12 --from openhands-ai openhands

it is installing a bunch of packages.

We are working towards a setup where we run the derived container with OpenHands where the only external network access is to a single endpoint to call the model. No other external communication will be allowed. (And even then, we will likely need to run a proxy for the endpoint and log and filter what the agent is sending to the model endpoint.)

What is the recommended way to fully install the OpenHands software inside of a container image so that when you run it in the running container, it just runs and does not need to install anything else or require any network communication at all (other than to the model endpoint)?

bartlettroscoe avatar Aug 31 '25 18:08 bartlettroscoe

Question: is running headless only supported using the development environment set up with Poetry, as documented at:

  • https://docs.all-hands.dev/usage/how-to/headless-mode

?

Why can't you run headless using the uv/uvx approach documented at:

  • https://docs.all-hands.dev/usage/how-to/cli-mode

?

bartlettroscoe avatar Sep 02 '25 14:09 bartlettroscoe

This issue is stale because it has been open for 40 days with no activity. Remove the stale label or leave a comment, otherwise it will be closed in 10 days.

github-actions[bot] avatar Oct 13 '25 02:10 github-actions[bot]

NOTE: We are just waiting on:

  • #10529
  • A full install of OpenHands during a derived container image build with no network communication when the derived container is run (see above).

and then we will give this a shot again. (But for now, we are getting by with Codex CLI calling locally hosted gpt-oss models.)

bartlettroscoe avatar Oct 13 '25 14:10 bartlettroscoe