
Add healthcheck endpoint to XP

Open gbbirkisson opened this issue 3 years ago • 3 comments

This is a feature request to add a health check endpoint to XP. The need originates from the operator, but we can just as well accommodate non-k8s use cases.

The operator has up to this point relied on the built-in ES healthcheck: <host>:9200/_cluster/health?wait_for_status=green&timeout=1s. This has worked OK so far, but it has its drawbacks. For example, if XP cannot write to the blobstore, the startup will fail but ES will report that everything is OK.
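For illustration, the current ES-based check could be sketched roughly as below. This is a minimal sketch, not the operator's actual code: the helper names `cluster_health_url` and `reached_status` are hypothetical, the `http` scheme is an assumption, and the only response field relied on is `timed_out` from the cluster health JSON body.

```python
import json
from urllib.parse import urlencode


def cluster_health_url(host, wait_for_status="green", timeout="1s"):
    """Build the ES cluster health URL quoted above (http scheme is assumed)."""
    query = urlencode({"wait_for_status": wait_for_status, "timeout": timeout})
    return f"http://{host}:9200/_cluster/health?{query}"


def reached_status(body):
    """Return True if the wait did not time out.

    `body` is the JSON text returned by the cluster health call.
    A missing `timed_out` field is treated as a failure.
    """
    health = json.loads(body)
    return not health.get("timed_out", True)
```

Note that this only inspects what ES reports about the cluster. It says nothing about XP itself, which is exactly the drawback described above.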

We need an endpoint that will:

  • Accept a similar query parameter specifying the acceptable ES state (green or yellow).
  • Give a 200 OK when "XP is ready".

The first point is pretty trivial and we need that to do automatic rolling updates. We need to be able to specify that a data node is not ready until the ES cluster is green. This is not as important for pure frontend nodes, and they can be ready when ES is yellow.

The second point is a bit more complicated. The question is, when do we deem XP ready? This is up for debate but the conditions could include:

  • Elasticsearch is up
  • Hazelcast is up (in clustered setting)
  • XP started listening on 8080
  • ...
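To make the proposal concrete, here is a rough sketch of how such a readiness decision might combine the conditions listed above with the acceptable-ES-state parameter. Everything in it is hypothetical: `NodeState`, `readiness_status_code`, and the `min_es_status` parameter are illustrative names, not XP APIs, and returning 503 when not ready is an assumption (the request only specifies 200 OK for "ready").

```python
from dataclasses import dataclass

# Ordering of ES cluster states, worst to best.
ES_RANK = {"red": 0, "yellow": 1, "green": 2}


@dataclass
class NodeState:
    """Hypothetical snapshot of the conditions listed above."""
    es_status: str          # "red", "yellow" or "green"
    hazelcast_up: bool      # only relevant in a clustered setting
    http_listening: bool    # XP started listening on 8080
    clustered: bool = True


def readiness_status_code(state, min_es_status="green"):
    """Return 200 when the node counts as ready, else 503 (assumed)."""
    if min_es_status not in ("yellow", "green"):
        raise ValueError("min_es_status must be 'yellow' or 'green'")
    # An unknown ES status ranks below everything and is never acceptable.
    es_ok = ES_RANK.get(state.es_status, -1) >= ES_RANK[min_es_status]
    # Hazelcast is only required when the node runs in a cluster.
    hazelcast_ok = state.hazelcast_up or not state.clustered
    ready = es_ok and hazelcast_ok and state.http_listening
    return 200 if ready else 503
```

With this shape, a data node would be probed with `min_es_status="green"` so a rolling update waits for the cluster to recover, while a pure frontend node could be probed with `min_es_status="yellow"`.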

gbbirkisson avatar Nov 08 '21 12:11 gbbirkisson

Aren't you describing the "alive app" here?

sigdestad avatar Nov 08 '21 17:11 sigdestad

@sigdestad No, I am not. That app does not support the yellow/green functionality at the moment.

Also, I am of the opinion that this is a core feature that should be in XP, not in an app. Preinstalling a bunch of apps to get XP to run properly on K8s is not the way to go.

gbbirkisson avatar Nov 09 '21 08:11 gbbirkisson

Relevant discussion: https://github.com/enonic/xp-operator/issues/65

gbbirkisson avatar May 13 '22 06:05 gbbirkisson

Work continues in #10094 and #10096.

vbradnitski avatar Mar 28 '23 19:03 vbradnitski

The Elasticsearch yellow/green status applies to the entire cluster and cannot tell whether an individual XP node is alive/healthy. We decided to go in the other direction (see above).

rymsha avatar Apr 11 '23 11:04 rymsha