Add end-to-end exec support across controller, worker, and CLI

Open kroo-oai opened this issue 1 month ago • 6 comments

If merged, this PR will:

  • Add the orchard exec CLI command that waits for a VM, streams stdin/stdout, and handles TTY resize (internal/command/exec/*.go, pkg/client/vms.go)
  • Deliver exec watch actions and worker rendezvous over HTTP and gRPC, including controller WebSocket bridging (internal/controller/api_vms_exec.go, internal/controller/api_rpc_exec.go, internal/controller/api_rpc_watch.go, pkg/client/rpc.go, rpc/constants.go)
  • Teach the worker to proxy exec frames between the controller and the guest agent with VM lookup retries and legacy stream fallbacks (internal/worker/exec.go, internal/worker/rpc.go, internal/worker/rpcv2.go, internal/worker/worker.go)
  • Define the exec frame schema plus guest agent service stubs so both sides share the same JSON envelope and gRPC API (internal/execstream/frame.go, rpc/guestagent/guestagent.proto, rpc/guestagent/*.pb.go)
  • Extend controller and worker capability metadata so clients can discover exec support (pkg/resource/v1/v1.go, pkg/resource/v1/worker.go, pkg/resource/v1/watch_instruction.go, internal/controller/api_controller.go)
  • Cover the new behavior with an integration test that starts a VM and verifies the frame sequence from a remote /bin/echo (internal/tests/integration_test.go)

Testing:

  • Ran go test ./...
  • Ran go test ./internal/tests -tags=integration
  • Ran manual non-interactive exec against a running VM
  • Ran manual --interactive exec with --tty to confirm resize and stdin handling

kroo-oai avatar Oct 10 '25 18:10 kroo-oai

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Oct 10 '25 18:10 CLAassistant

Is the idea here to be able to run commands inside VMs that don't have SSH enabled? If an SSH-enabled VM is an option, then using the Orchard controller as an SSH jump host might also work.

fkorotkov avatar Oct 17 '25 16:10 fkorotkov

You’ve got it—the driver is exactly those cases where SSH isn’t available or desirable. Orchard’s exec path lets us reach VMs without opening SSH ports or distributing keys, and it keeps everything on the controller’s existing channel (auth, audit, policy). If a VM already exposes SSH and you’re comfortable running a jump host, that can work, but it reintroduces credential management, extra network surface, and more moving parts. Exec avoids that overhead while still giving us the command execution capability we need.

That's the thinking, anyway!

kroo-oai avatar Oct 17 '25 17:10 kroo-oai

I think we should start implementing https://github.com/cirruslabs/orchard/issues/332 as a simple non-interactive endpoint first.

You pass the command to execute, and optionally predetermined standard input contents, and get a Transfer-Encoding: chunked response that streams standard output and error.

This should solve most of the day-to-day use-cases and wouldn't require the user to deal with WebSockets.

Would that work for you?

edigaryev avatar Oct 17 '25 17:10 edigaryev

Makes sense to me! I agree that simple commands should be simple to integrate with (and only interactive sessions should need to upgrade to a WebSocket).

kroo-oai avatar Oct 17 '25 17:10 kroo-oai

@edigaryev, thinking this through, there are two neat ways we can slice the minimal API:

Option 1 · Simplicity First

  • POST /v1/vms/:name/exec (body carries optional stdin)
  • Controller responds 200 OK and streams newline-delimited JSON chunks such as {"event":"stdout","data":"…"}.
  • One request in, one stream out; callers don’t need to juggle anything else.

Option 2 · Upgradable

  • Same POST, but the controller returns 201 Created with Location: /v1/vms/:name/exec/:session_id.
  • The client then GETs that URL to receive the same chunked JSON stream.

That extra hop gives us a stable session resource we can reuse later—to reconnect to a long-running command, expose metadata (exit code, duration), or negotiate a WebSocket by hitting the same URL with ?interactive=true or similar. It also keeps the door open for future endpoints like POST …/stdin if we want to dribble more input.

Option 1 is dead simple today; Option 2 is almost as simple but buys a clear path to interactivity and reconnection without revisiting the public contract. Does Option 2 line up with what you had in mind, or would you rather we keep to Option 1 for the first cut?

kroo-oai avatar Oct 17 '25 18:10 kroo-oai