
Changing the gRPC protocol to implement standard gRPC Health Checking Protocol

Open omidb opened this issue 3 years ago • 5 comments

Is your feature request related to a problem? Please describe.

Triton currently uses its own protocol to define readiness and health over gRPC; here you can find the definition for Triton.

On the other hand, gRPC has a standard way of handling this described here: Link to gRPC docs

Kubernetes has supported health and readiness checks over gRPC since version 1.23. If Triton implements the standard protocol, we can use it directly from Kubernetes.
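As a rough sketch of what that could look like, the snippet below builds the container portion of a pod spec with built-in gRPC liveness and readiness probes and prints it as JSON (which `kubectl` accepts alongside YAML). It assumes the server answers the standard protocol on Triton's default gRPC port, 8001; the container name, the probe periods, and the `grpc_probe` helper are illustrative and not taken from any Triton documentation.

```python
import json

# Triton's default gRPC port; adjust if the server is started with a different one.
TRITON_GRPC_PORT = 8001


def grpc_probe(port: int, period_seconds: int = 10) -> dict:
    """Build a Kubernetes gRPC probe stanza as a plain dict (hypothetical helper)."""
    return {"grpc": {"port": port}, "periodSeconds": period_seconds}


# Container section of a pod spec. The name and periods are placeholders.
container = {
    "name": "triton",
    "ports": [{"containerPort": TRITON_GRPC_PORT}],
    "livenessProbe": grpc_probe(TRITON_GRPC_PORT),
    "readinessProbe": grpc_probe(TRITON_GRPC_PORT, period_seconds=5),
}

print(json.dumps(container, indent=2))
```

Both probes target the same port as inference traffic, which matters on platforms that expose only a single port per container.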

Describe the solution you'd like

Change the two RPC calls to follow the gRPC standard.

omidb avatar Jul 08 '22 14:07 omidb

Hi @omidb,

Thanks for raising this. From a quick glance, it looks like, given the current Triton protocols, you would embed a simple gRPC app into the server container and set up liveness/readiness probes to query the server health, similar to this: https://github.com/grpc-ecosystem/grpc-health-probe#example-grpc-health-checking-on-kubernetes.

Is this what you are doing today? Or do you currently have an alternative?


@tanmayv25 what do you think of this? I believe the request is to change ServerLive/ServerReady to this gRPC standard instead. Given backwards compatibility constraints, maybe new RPCs could be added alongside the existing ones, and the old ones phased out eventually if we want to move forward with this. This would allow Kubernetes to probe gRPC server health as a built-in feature with K8s 1.23 and later.
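To make the idea concrete, here is a minimal sketch of how the two existing booleans might be folded into the single status returned by the standard `Check` RPC. The enum values mirror `ServingStatus` in the gRPC health checking proto; the mapping function itself is an assumption for illustration, not Triton's confirmed behavior.

```python
from enum import IntEnum


class ServingStatus(IntEnum):
    """Status values defined by grpc.health.v1.HealthCheckResponse."""
    UNKNOWN = 0
    SERVING = 1
    NOT_SERVING = 2
    SERVICE_UNKNOWN = 3


def to_serving_status(live: bool, ready: bool) -> ServingStatus:
    """Hypothetical mapping from ServerLive/ServerReady results to one status.

    Assumption: the server reports SERVING only when it is both live and
    ready, and NOT_SERVING in every other case.
    """
    if live and ready:
        return ServingStatus.SERVING
    return ServingStatus.NOT_SERVING


print(to_serving_status(live=True, ready=True).name)
print(to_serving_status(live=True, ready=False).name)
```

Because the standard protocol returns one status per service name, a live-but-not-ready server cannot be distinguished from a dead one under this mapping, which is one reason to keep the original RPCs available as well.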

rmccorm4 avatar Jul 08 '22 19:07 rmccorm4

Hi @rmccorm4, that can be a good alternative, but it can be a little challenging to serve it on the same port. Right now, I'm getting warnings for the liveness and health probes from Kubernetes, so long story short, I have no alternative at the moment. Knative doesn't let you expose multiple ports, unfortunately.

I agree that, for backward compatibility, we can just add the new RPC calls to the Triton server protocol.

omidb avatar Jul 09 '22 14:07 omidb

I think it is okay to implement these RPCs in Triton if it makes the K8s pipeline easier for the users. @rmccorm4 Can you open a ticket for this enhancement?

tanmayv25 avatar Jul 11 '22 17:07 tanmayv25

Filed ticket DLIS-3949

rmccorm4 avatar Jul 11 '22 17:07 rmccorm4

@rmccorm4 @tanmayv25 Hi, I am facing a different issue, but this enhancement could be the solution to it.

Is there a timeline for this feature, or a target Triton image version?

PRIYANKArythem3 avatar Jul 29 '22 15:07 PRIYANKArythem3

The health check has now been merged in here: https://github.com/triton-inference-server/server/pull/5267

It is on the main branch and should be in the official 23.02 release.

the-david-oy avatar Jan 23 '23 21:01 the-david-oy
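For anyone who wants to try the standard health check from a client, the sketch below calls the `Check` RPC of `grpc.health.v1.Health` using the `grpcio` and `grpcio-health-checking` packages. It assumes the server listens on `localhost:8001` and that an empty service name asks for overall server status, as is the usual convention in the gRPC health protocol; the thread does not confirm which service names Triton answers to, and the `probe` function name is hypothetical.

```python
def probe(target: str = "localhost:8001", service: str = "", timeout_s: float = 2.0) -> bool:
    """Return True if the standard gRPC health check reports SERVING.

    Requires: pip install grpcio grpcio-health-checking
    The imports are kept inside the function so that merely defining it
    does not require those packages to be installed.
    """
    import grpc
    from grpc_health.v1 import health_pb2, health_pb2_grpc

    with grpc.insecure_channel(target) as channel:
        stub = health_pb2_grpc.HealthStub(channel)
        request = health_pb2.HealthCheckRequest(service=service)
        try:
            response = stub.Check(request, timeout=timeout_s)
        except grpc.RpcError:
            # Unreachable server, unknown service, or deadline exceeded.
            return False
        return response.status == health_pb2.HealthCheckResponse.SERVING


if __name__ == "__main__":
    print("serving" if probe() else "not serving")
```

This is the same call that the Kubernetes built-in gRPC probe and `grpc-health-probe` issue, so it can be handy for checking a deployment by hand before wiring up probes.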