zilla
Add native support for gRPC health check
Is your feature request related to a problem? Please describe.
I'm trying to set up a gRPC proxy as a front end to our Kafka server so that we can access it using gRPC (which we already use widely). I will deploy this Zilla-based proxy in a Kubernetes cluster. The cluster, however, requires a health endpoint, and when the probe is configured for gRPC the endpoint needs to implement the standard gRPC health check protocol. If my proxy does not answer with SERVING, the pod will be taken down.
When running locally I use the grpc-health-probe tool, and I believe this is fully compatible with how the Kubernetes gRPC health-check works.
https://github.com/grpc-ecosystem/grpc-health-probe
The incoming request looks like this:
/grpc.health.v1.Health/Check
The request and response messages in the proto look like this:
message HealthCheckRequest {
  string service = 1;
}
message HealthCheckResponse {
  enum ServingStatus {
    UNKNOWN = 0;
    SERVING = 1;
    NOT_SERVING = 2;
    SERVICE_UNKNOWN = 3; // Used only by the Watch method.
  }
  ServingStatus status = 1;
}
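For reference, here is a minimal sketch of how these two messages look on the wire. It assumes standard proto3 encoding and gRPC's length-prefixed message framing (a 1-byte compression flag followed by a 4-byte big-endian length). The helper names are hypothetical and only meant for illustration; a real client or server would use generated protobuf code instead.

```python
import struct

# ServingStatus values copied from the proto definition above.
SERVING_STATUS = {
    "UNKNOWN": 0,
    "SERVING": 1,
    "NOT_SERVING": 2,
    "SERVICE_UNKNOWN": 3,
}


def encode_health_check_request(service: str) -> bytes:
    """Encode HealthCheckRequest{service} by hand (hypothetical helper)."""
    data = service.encode("utf-8")
    if not data:
        return b""  # proto3 omits fields that hold their default value
    if len(data) >= 128:
        raise ValueError("sketch only handles single-byte length prefixes")
    # Field 1, wire type 2 (length-delimited) gives tag byte 0x0A.
    return bytes([0x0A, len(data)]) + data


def encode_health_check_response(status: str) -> bytes:
    """Encode HealthCheckResponse{status} by hand (hypothetical helper)."""
    value = SERVING_STATUS[status]
    if value == 0:
        return b""  # UNKNOWN is the default, so nothing is written
    # Field 1, wire type 0 (varint) gives tag byte 0x08; values 1..3 fit in one byte.
    return bytes([0x08, value])


def frame_grpc_message(payload: bytes) -> bytes:
    """Prefix a payload with the gRPC compression flag (0) and its length."""
    return struct.pack(">BI", 0, len(payload)) + payload


if __name__ == "__main__":
    framed = frame_grpc_message(encode_health_check_response("SERVING"))
    print(framed.hex())
```

An empty service name in the request asks about the overall health of the server, which is the case I care about here.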
Describe the solution you'd like
Ideally I would like to see native, built-in support for the gRPC health check. For me it's enough to have it statically answer SERVING as long as Zilla is running properly, but others might need to probe some of the upstream services and respond based on the result.
Describe alternatives you've considered
An alternative solution would be to add a "direct response" value directly in the zilla.yaml file, telling Zilla to reply with SERVING. Something similar to the config snippet below.
north_http_server:
  type: http
  kind: server
  routes:
    - when:
        - headers:
            :path: /grpc.health.v1.Health/*
      exit: north_health_server
north_health_server:
  type: direct_response
  kind: server
  options:
    response:
      value: 'SERVING'
      code: 200
Having a way to specify a direct response in a straightforward way like this would actually be really useful. But maybe it can already be accomplished using catalogs?
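One caveat with a plain-text value: as far as I understand, a gRPC client expects more than the string SERVING and an HTTP 200. Below is a rough sketch of what a static reply would have to carry, assuming standard gRPC-over-HTTP/2 conventions. The function name and the dict layout are purely illustrative and do not correspond to any existing Zilla option.

```python
# Length-prefixed HealthCheckResponse{status: SERVING}:
#   00          -> compression flag (uncompressed)
#   00 00 00 02 -> payload length, big-endian (2 bytes)
#   08 01       -> field 1 (status), varint value 1 (SERVING)
SERVING_BODY = bytes.fromhex("00000000020801")


def static_health_reply() -> dict:
    """Describe the parts of a static SERVING reply (hypothetical layout)."""
    return {
        # Response headers sent before the message.
        "headers": {":status": "200", "content-type": "application/grpc"},
        # The framed protobuf message, not the literal text 'SERVING'.
        "body": SERVING_BODY,
        # gRPC reports the call outcome in trailers; 0 means OK.
        "trailers": {"grpc-status": "0"},
    }


if __name__ == "__main__":
    reply = static_health_reply()
    print(reply["headers"], reply["body"].hex(), reply["trailers"])
```

So a direct-response binding would probably need a way to set binary content and trailers, which is another argument for built-in support.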
@cagecurrent we have first-class support for gRPC in the grpc binding, so that seems like a natural place to potentially add support for the standard grpc.health.v1 service.
For example:
north_http_server:
  type: http
  kind: server
  exit: north_grpc_server
north_grpc_server:
  type: grpc
  kind: server
  options:
    services:
      - grpc.health.v1
  exit: ...
This design aligns with the proposed approach for the grpc.reflection.v1 enhancement, see https://github.com/aklivity/zilla/issues/954
Please confirm that this would work for your scenario.
@jfallows, nice to see that you are working on this. Your solution seems OK, although I'm a bit too new to Zilla to give a nuanced response. As I mentioned, for me the most straightforward behavior is for the health check to just report that Zilla is running. I could see another scenario, however, where you want to probe some other service and base the health-check response on that.
I also tried to get Zilla to simply proxy the gRPC health-check request to another service (a "dummy" program running in the same Docker container, on a different port) and return its status response. I didn't get that to work, but that may have been due to my limited Zilla knowledge and experience.
I would like to work on this ticket.