Changing the gRPC protocol to implement standard gRPC Health Checking Protocol
Is your feature request related to a problem? Please describe.
Triton currently uses its own protocol to report readiness and health over gRPC; the definition is in the Triton protocol specification.
gRPC, on the other hand, has a standard way of handling this, the gRPC Health Checking Protocol, described in the gRPC docs.
Kubernetes supports liveness and readiness checks over gRPC as of version 1.23. If Triton implements the standard protocol, we can use it directly from Kubernetes.
Describe the solution you'd like
Change the two RPCs to follow the gRPC standard.
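To make the Kubernetes side of the request concrete, here is a minimal sketch of the probe configuration this would enable, built as a Python dict and printed as JSON. The `grpc` probe field with `port` and optional `service` is the built-in Kubernetes mechanism; the helper names `grpc_probe` and `container_probes` are hypothetical, and port 8001 is assumed to be the port Triton's gRPC endpoint listens on in your deployment.

```python
import json


def grpc_probe(port, service="", period_seconds=10, initial_delay_seconds=0):
    """Build a Kubernetes gRPC probe definition as a plain dict.

    Hypothetical helper for illustration; the keys follow the Kubernetes
    probe schema (grpc.port, grpc.service, periodSeconds, initialDelaySeconds).
    """
    probe = {
        "grpc": {"port": port},
        "periodSeconds": period_seconds,
        "initialDelaySeconds": initial_delay_seconds,
    }
    # The service name is optional; when omitted, the kubelet sends an
    # empty service name in the health check request.
    if service:
        probe["grpc"]["service"] = service
    return probe


def container_probes(port=8001):
    """Return liveness and readiness probes for a container.

    Assumption: the server answers the standard health check on `port`.
    """
    return {
        "livenessProbe": grpc_probe(port, initial_delay_seconds=10),
        "readinessProbe": grpc_probe(port),
    }


if __name__ == "__main__":
    # This fragment would sit under a container entry in a pod spec.
    print(json.dumps(container_probes(), indent=2))
```

Because both probes call the same standard `Check` RPC, no sidecar or extra port is needed, which matters on platforms that expose a single port.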
Hi @omidb,
Thanks for raising this. From a quick glance, it looks like, given the current Triton protocols, you would embed a simple gRPC app in the server container and set up liveness/readiness probes to query the server health, similar to this: https://github.com/grpc-ecosystem/grpc-health-probe#example-grpc-health-checking-on-kubernetes.
Is this what you are doing today? Or do you currently have an alternative?
@tanmayv25 what do you think of this? I believe the request is to change ServerLive/ServerReady to this gRPC standard instead. Given backwards compatibility constraints, maybe new RPCs could be added alongside the existing ones, and the old ones phased out eventually if we want to move forward with this. This would allow Kubernetes to probe gRPC server health as a built-in feature with K8s >= 1.23.
Hi @rmccorm4, that can be a good alternative, but it can be a little challenging to have two servers up on the same port. Right now I'm getting warnings for the liveness and readiness probes from Kubernetes, so long story short, I have no alternative at the moment. Unfortunately, Knative doesn't let you expose multiple ports.
I agree that, for backward compatibility, we can just add the new RPCs to the Triton server protocol.
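As a rough illustration of adding the standard RPC next to the existing ones, the sketch below maps a server readiness flag onto the standard health status. The enum values mirror `HealthCheckResponse.ServingStatus` from the gRPC health proto; the `check` function and the policy of deriving the status from readiness alone are assumptions for illustration, not a description of Triton's actual implementation.

```python
from enum import IntEnum


class ServingStatus(IntEnum):
    """Mirrors HealthCheckResponse.ServingStatus in grpc.health.v1."""

    UNKNOWN = 0
    SERVING = 1
    NOT_SERVING = 2
    SERVICE_UNKNOWN = 3  # used only by the Watch method in the standard


def check(server_ready: bool) -> ServingStatus:
    """Hypothetical handler body for the standard Check RPC.

    Assumption: the same signal that backs ServerReady decides the
    standard status, so existing clients and new probes agree.
    """
    return ServingStatus.SERVING if server_ready else ServingStatus.NOT_SERVING


if __name__ == "__main__":
    for ready in (True, False):
        print(ready, check(ready).name)
```

Keeping ServerLive/ServerReady untouched and deriving the new response from the same state means existing clients see no change.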
I think it is okay to implement these RPCs in Triton if it makes the K8s pipeline easier for the users. @rmccorm4 Can you open a ticket for this enhancement?
Filed ticket DLIS-3949
@rmccorm4 @tanmayv25 Hi, I am facing a different issue, but this enhancement could solve it.
Is there a timeline for this feature, or a target Triton image version?
The health check has now been merged here: https://github.com/triton-inference-server/server/pull/5267
It is on the main branch and should be in the official 23.02 release.