allow starting without a running plugin
We use an external plugin. Sshpiperd will gracefully survive plugin restart (good), but refuses to start unless the plugin is already running. This makes for an inconvenient startup dependency order.
I’d like to relax this. Happy to do it by default, or behind an explicit flag.
Would you want a PR doing this?
(We’re going to do it regardless, the only question is whether you want it upstreamed and if so in what form.)
If our external plugin implements a grpc health check endpoint, and it's possible to modify how sshpiperd's gRPC plugin clients are instantiated, we can probably implement a simplistic "graceful startup" period that would allow us to start sshpiperd more independently of our plugin server:
- sshpiperd starts in "lame duck" mode (maybe the wrong name for it, but it would be waiting for plugin grpc connections to be healthy, and is not yet serving requests of its own)
- start dialing for grpc plugin endpoint connections, and begin timeout period
- if some minimum number of plugin grpc connections become healthy before the timeout elapses, sshpiperd leaves lame duck mode and starts serving its own requests
- otherwise if the timeout elapses without grpc connections becoming healthy, it exits with an error.
https://github.com/grpc/grpc-go/tree/master/examples/features/health
This wouldn't be a terribly complex setup (for our use case, at least), but all the usual caveats for doing health checks would still apply (exponential backoff for retries, use circuit breakers to mitigate risk of cascading failures etc).
> otherwise if the timeout elapses without grpc connections becoming healthy, it exits with an error.
I would prefer to omit this.
Otherwise, this seems great.
I’d also be open to the much simpler mode in which sshpiperd simply starts and serves but is in a broken state due to the gRPC being AWOL.
let's just start with the simpler mode then, and consider adding health checks only if the simpler mode doesn't suffice for our situation
may i know what lame duck would be like, not accepting incoming tcp conn?
or rejecting?
there is no probe for sshpiper now, so i prefer something that can be detected by load balancer
regular cmd plugin has an impl where sshpiper exits when the child crashes: https://github.com/tg123/sshpiper/blob/bc446f4cbf63b1e0d8c449f22f1868bdaaacb020/cmd/sshpiperd/main.go#L338C7-L338C12 i was thinking of letting sshpiper restart cmd plugins when they crash. however, i chose to let the container restart everything, which seems cleaner and easier
for grpc plugin, i agree that there is no need to have a valid grpc conn when bootstrapping
the experience should be the same as a grpc backend crash, with sshpiper simply retrying the grpc call when a new ssh client comes.
some takeaways
- agree to start without valid grpc conn
- probe for lb if in lame duck mode?
- should cmd plugin align with net grpc?
Apologies for the long delay here. I started implementation and realized there were some subtleties.
The main sticking point is that we need an initial plugin connection to learn things like auth methods.
What I ended up doing locally is extracting plugin initialization, and then:
- calling it eagerly during startup
- wrapping the net.Listener to initialize plugins if never before initialized, single-flighted, on connect
I'll put some miles on it and see how it holds up...
feel free to let me know if there is anything i can help with. i was very busy with my day job recently and need some side-project coding to relax