orchard
Add end-to-end exec support across controller, worker, and CLI
If merged, this PR will:
- Add the `orchard exec` CLI command that waits for a VM, streams stdin/stdout, and handles TTY resize (`internal/command/exec/*.go`, `pkg/client/vms.go`)
- Deliver exec watch actions and worker rendezvous over HTTP and gRPC, including controller WebSocket bridging (`internal/controller/api_vms_exec.go`, `internal/controller/api_rpc_exec.go`, `internal/controller/api_rpc_watch.go`, `pkg/client/rpc.go`, `rpc/constants.go`)
- Teach the worker to proxy exec frames between the controller and the guest agent with VM lookup retries and legacy stream fallbacks (`internal/worker/exec.go`, `internal/worker/rpc.go`, `internal/worker/rpcv2.go`, `internal/worker/worker.go`)
- Define the exec frame schema plus guest agent service stubs so both sides share the same JSON envelope and gRPC API (`internal/execstream/frame.go`, `rpc/guestagent/guestagent.proto`, `rpc/guestagent/*.pb.go`)
- Extend controller and worker capability metadata so clients can discover exec support (`pkg/resource/v1/v1.go`, `pkg/resource/v1/worker.go`, `pkg/resource/v1/watch_instruction.go`, `internal/controller/api_controller.go`)
- Cover the new behavior with an integration test that starts a VM and verifies the frame sequence from a remote `/bin/echo` (`internal/tests/integration_test.go`)
Testing:
- Ran `go test ./...`
- Ran `go test ./internal/tests -tags=integration`
- Ran manual non-interactive exec against a running VM
- Ran manual `--interactive` exec with `--tty` to confirm resize and stdin handling
Is the idea here to be able to run commands inside VMs that don't have SSH enabled? If SSH-enabled VMs are an option, then using the Orchard controller as an SSH jump host might be an alternative.
You’ve got it—the driver is exactly those cases where SSH isn’t available or desirable. Orchard’s exec path lets us reach VMs without opening SSH ports or distributing keys, and it keeps everything on the controller’s existing channel (auth, audit, policy). If a VM already exposes SSH and you’re comfortable running a jump host, that can work, but it reintroduces credential management, extra network surface, and more moving parts. Exec avoids that overhead while still giving us the command execution capability we need.
That's the thinking, anyway!
I think we should start implementing https://github.com/cirruslabs/orchard/issues/332 as a simple non-interactive endpoint first.
You pass the command to execute, and optionally predetermined standard input contents, and get a `Transfer-Encoding: chunked` response that streams standard output and error.
This should solve most of the day-to-day use-cases and wouldn't require the user to deal with WebSockets.
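To make the proposal concrete, here is a rough Go sketch of such an endpoint. Everything in it is an assumption for illustration, not Orchard's actual code: the `execRequest` JSON fields, the `runner` type that stands in for the path to the guest agent, and the `fakeRunner` used for the demo. It relies only on standard `net/http` behavior: when a handler sets no `Content-Length` and flushes, an HTTP/1.1 response is sent with chunked transfer encoding.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"strings"
)

// execRequest is a hypothetical request body; the field names are assumptions.
type execRequest struct {
	Command string `json:"command"`
	Stdin   string `json:"stdin,omitempty"`
}

// runner stands in for whatever actually executes the command inside the VM.
// It writes combined standard output and error to output.
type runner func(command string, stdin io.Reader, output io.Writer) error

// flushWriter flushes after every write so each chunk reaches the client promptly.
type flushWriter struct {
	w http.ResponseWriter
	f http.Flusher
}

func (fw flushWriter) Write(p []byte) (int, error) {
	n, err := fw.w.Write(p)
	fw.f.Flush()
	return n, err
}

// execHandler decodes the request and streams the runner's output back.
func execHandler(run runner) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		var req execRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil || req.Command == "" {
			http.Error(w, "invalid exec request", http.StatusBadRequest)
			return
		}
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		// No Content-Length is set, so net/http falls back to chunked encoding.
		w.Header().Set("Content-Type", "application/octet-stream")
		w.WriteHeader(http.StatusOK)
		flusher.Flush()
		if err := run(req.Command, strings.NewReader(req.Stdin), flushWriter{w, flusher}); err != nil {
			// The status line is already sent, so the error can only be logged here.
			log.Printf("exec failed: %v", err)
		}
	}
}

// fakeRunner is a placeholder that echoes the command and then the stdin contents.
func fakeRunner(command string, stdin io.Reader, output io.Writer) error {
	if _, err := fmt.Fprintf(output, "ran: %s\n", command); err != nil {
		return err
	}
	_, err := io.Copy(output, stdin)
	return err
}

// demo starts a throwaway server, issues one exec request, and reports
// the response body and whether it arrived with chunked encoding.
func demo() (body string, chunked bool, err error) {
	srv := httptest.NewServer(execHandler(fakeRunner))
	defer srv.Close()
	payload := `{"command":"/bin/echo hi","stdin":"input\n"}`
	resp, err := http.Post(srv.URL+"/v1/vms/demo/exec", "application/json", strings.NewReader(payload))
	if err != nil {
		return "", false, err
	}
	defer resp.Body.Close()
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", false, err
	}
	for _, te := range resp.TransferEncoding {
		if te == "chunked" {
			chunked = true
		}
	}
	return string(data), chunked, nil
}

func main() {
	body, chunked, err := demo()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("chunked=%v\n%s", chunked, body)
}
```

One limitation of this shape is that the exit code has no obvious place in a raw byte stream once the 200 status is sent; it would need a trailer or a framed format.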
Would that work for you?
Makes sense to me! I agree that simple commands should be simple to integrate with (and only interactive should need to upgrade to a websocket).
@edigaryev, thinking this through, there are two neat ways we can slice the minimal API:
Option 1 · Simplicity First
POST /v1/vms/:name/exec (body carries optional stdin)
Controller responds 200 OK and streams newline-delimited JSON chunks such as {"event":"stdout","data":"…"}. One request in, one stream out; callers don’t need to juggle anything else.
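For a sense of what a caller would have to write under Option 1, here is a small Go sketch that consumes such a newline-delimited JSON stream. The `event` and `data` fields come from the example above; the `stderr` event name and the helper names (`collectOutput`, `parseStream`) are assumptions made up for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// execEvent mirrors the example chunk {"event":"stdout","data":"…"}.
type execEvent struct {
	Event string `json:"event"`
	Data  string `json:"data"`
}

// collectOutput reads newline-delimited JSON events until EOF and
// separates them by stream. The "stderr" event name is an assumption.
func collectOutput(r io.Reader) (stdout, stderr string, err error) {
	var outBuf, errBuf strings.Builder
	dec := json.NewDecoder(r) // accepts consecutive JSON values separated by whitespace
	for {
		var ev execEvent
		if err := dec.Decode(&ev); errors.Is(err, io.EOF) {
			break
		} else if err != nil {
			return "", "", fmt.Errorf("decode event: %w", err)
		}
		switch ev.Event {
		case "stdout":
			outBuf.WriteString(ev.Data)
		case "stderr":
			errBuf.WriteString(ev.Data)
		}
		// Unknown event types are ignored so the format can grow later.
	}
	return outBuf.String(), errBuf.String(), nil
}

// parseStream is a convenience wrapper for feeding a literal stream.
func parseStream(s string) (stdout, stderr string, err error) {
	return collectOutput(strings.NewReader(s))
}

func main() {
	stream := `{"event":"stdout","data":"hello\n"}
{"event":"stderr","data":"warning\n"}
{"event":"stdout","data":"world\n"}
`
	stdout, stderr, err := parseStream(stream)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("stdout: %q\nstderr: %q\n", stdout, stderr)
}
```

In a real client the reader would be the HTTP response body, so events are handled as they arrive instead of after the command finishes.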
Option 2 · Upgradable
Same POST, but the controller returns 201 Created with Location: /v1/vms/:name/exec/:session_id. The client then GETs that URL to receive the same chunked JSON stream.
That extra hop gives us a stable session resource we can reuse later—to reconnect to a long-running command, expose metadata (exit code, duration), or negotiate a WebSocket by hitting the same URL with ?interactive=true or similar. It also keeps the door open for future endpoints like POST …/stdin if we want to dribble more input.
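The two-step flow of Option 2 might look like this from the client side. This is only a sketch: the in-memory `newFakeController`, the fixed VM name `demo`, the `s1`-style session IDs, and the plain-text request body are all placeholders invented for the example, not a proposed implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

type execEvent struct {
	Event string `json:"event"`
	Data  string `json:"data"`
}

// newFakeController is a stand-in controller with in-memory sessions
// for a single VM named "demo".
func newFakeController() http.Handler {
	var mu sync.Mutex
	next := 0
	commands := map[string]string{}

	mux := http.NewServeMux()
	// Step 1: POST creates a session and points at it via Location.
	mux.HandleFunc("/v1/vms/demo/exec", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "cannot read body", http.StatusBadRequest)
			return
		}
		mu.Lock()
		next++
		id := fmt.Sprintf("s%d", next)
		commands[id] = string(body)
		mu.Unlock()
		w.Header().Set("Location", "/v1/vms/demo/exec/"+id)
		w.WriteHeader(http.StatusCreated)
	})
	// Step 2: GET on the session URL returns the event stream.
	mux.HandleFunc("/v1/vms/demo/exec/", func(w http.ResponseWriter, r *http.Request) {
		id := strings.TrimPrefix(r.URL.Path, "/v1/vms/demo/exec/")
		mu.Lock()
		command, ok := commands[id]
		mu.Unlock()
		if !ok {
			http.NotFound(w, r)
			return
		}
		// Encode writes one JSON object followed by a newline.
		_ = json.NewEncoder(w).Encode(execEvent{Event: "stdout", Data: "ran " + command})
	})
	return mux
}

// startAndFetch creates a session, follows Location, and reads the stream.
func startAndFetch(baseURL, command string) (sessionPath, output string, err error) {
	resp, err := http.Post(baseURL+"/v1/vms/demo/exec", "text/plain", strings.NewReader(command))
	if err != nil {
		return "", "", err
	}
	resp.Body.Close()
	if resp.StatusCode != http.StatusCreated {
		return "", "", fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	loc, err := resp.Location() // resolves a relative Location against the request URL
	if err != nil {
		return "", "", err
	}
	stream, err := http.Get(loc.String())
	if err != nil {
		return "", "", err
	}
	defer stream.Body.Close()
	data, err := io.ReadAll(stream.Body)
	return loc.Path, string(data), err
}

func demo() (string, string, error) {
	srv := httptest.NewServer(newFakeController())
	defer srv.Close()
	return startAndFetch(srv.URL, "/bin/echo hi")
}

func main() {
	path, output, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("session: %s\n%s", path, output)
}
```

The cost compared with Option 1 is visible here: the controller has to keep session state between the two requests and decide when to expire it.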
Option 1 is dead simple today; Option 2 is almost as simple but buys a clear path to interactivity and reconnection without revisiting the public contract. Does Option 2 line up with what you had in mind, or would you rather we keep to Option 1 for the first cut?