
Feature Request: Health check for enterprise gateway

Open esevan opened this issue 5 years ago • 3 comments

This is a feature request.

As EG becomes one of the core microservices in a scalable Jupyter deployment, reliability should be a requirement for EG.

There are many ways to improve reliability, such as HA support and session persistence, but I think the easiest is to recover to the desired state by restarting EG when it crashes.

If EG exposes its liveness status via a /healthz endpoint, we can easily diagnose the state of EG and restart it when it's unhealthy.

Of course, enterprise cluster platforms already provide good automation for this kind of recovery, such as Kubernetes container probes.
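To make the restart-on-unhealthy idea concrete, here is a minimal sketch of a watchdog loop, assuming some probe function already exists. It loosely mirrors the `failureThreshold` and `periodSeconds` settings of a Kubernetes liveness probe. The names `watch`, `probe`, and `restart` are hypothetical and not part of EG.

```python
import time
from typing import Callable, Optional


def watch(
    probe: Callable[[], bool],
    restart: Callable[[], None],
    failure_threshold: int = 3,
    period_seconds: float = 10.0,
    max_checks: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll `probe` and call `restart` after `failure_threshold`
    consecutive failures. Returns the number of restarts triggered.

    `probe` and `restart` are supplied by the caller (hypothetical hooks);
    `max_checks=None` means run forever.
    """
    failures = 0
    restarts = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        try:
            healthy = probe()
        except Exception:
            # A probe that raises is treated as a failed check.
            healthy = False
        if healthy:
            failures = 0  # any success resets the failure streak
        else:
            failures += 1
            if failures >= failure_threshold:
                restart()
                restarts += 1
                failures = 0
        sleep(period_seconds)
    return restarts
```

Requiring several consecutive failures before restarting avoids bouncing EG on a single slow response. In a Kubernetes deployment the platform's own probes would play this role instead.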

If folks give a thumbs up to this idea, I'd like to discuss what unhealthy states can be tracked in EG, and how.

esevan avatar Jun 27 '19 05:06 esevan

@esevan, I am interested in this feature request. Are you working on it, or does it still need to be discussed?

achandak123 avatar Jun 23 '21 04:06 achandak123

@achandak123, Hi Amit

Unfortunately, I'm not working on it, and I don't think I can take it on since I'm working on another project now. I'd appreciate it if someone on this thread could contribute the feature.

esevan avatar Jun 23 '21 09:06 esevan

A liveness check can be performed using a GET against /api. This returns a JSON object consisting of the version string of the underlying Jupyter Server and the version string of the current EG instance.
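For anyone who wants to script this, here is a minimal sketch of such a check using only the Python standard library. The function names are hypothetical. It also assumes the response carries the Jupyter Server version under a `version` key; the exact key names are not spelled out in this thread, so adjust as needed.

```python
import http.client
import json
import urllib.request


def parse_api_response(status: int, body: bytes) -> bool:
    """Return True if the response looks like a healthy /api reply.

    Assumption: a healthy reply is HTTP 200 with a JSON object that
    contains a "version" key.
    """
    if status != 200:
        return False
    try:
        payload = json.loads(body)
    except ValueError:  # covers invalid JSON and invalid UTF-8
        return False
    return isinstance(payload, dict) and "version" in payload


def is_alive(base_url: str, timeout: float = 5.0) -> bool:
    """GET <base_url>/api and report whether EG answered sensibly."""
    url = base_url.rstrip("/") + "/api"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_api_response(resp.status, resp.read())
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, non-2xx status, malformed reply.
        return False
```

Usage would be something like `is_alive("http://localhost:8888")`, where the host and port are placeholders for wherever your EG instance listens.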

That said, I'd like to leave this open for further discussion. If the above is sufficient, I'm also happy to close the issue. :smile:

kevin-bates avatar May 20 '22 23:05 kevin-bates