relax ES checks in /api/v1/probes/startup

Open yolabingo opened this issue 6 months ago • 1 comments

Problem Statement

/api/v1/probes/startup returns 503 in situations where it should not. We are not sure of the details, but it seems to occur when there is no Active index, and possibly when the Active index is empty.

We use this endpoint for the startupProbe in Kubernetes to determine when a new pod is healthy. Until it succeeds, the new pod is not fully in service. If the server is running a long reindex, Kubernetes will terminate the pod while the reindex is still in progress, because the reindex time exceeds the default startup timeout we set for this probe. In a clustered environment, the first pod needs to be healthy before k8s will start other pods in the cluster.
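To make the timeout problem concrete, here is a rough sketch of the arithmetic. It assumes the usual Kubernetes behaviour, where a startupProbe gets roughly `initialDelaySeconds + periodSeconds * failureThreshold` seconds to succeed before the container is restarted. The probe settings and the one-hour reindex are made-up numbers for illustration, not our actual configuration.

```python
def startup_budget_seconds(initial_delay: int, period: int, failure_threshold: int) -> int:
    """Approximate time a startupProbe may keep failing before the container is restarted."""
    return initial_delay + period * failure_threshold


def survives_startup(reindex_seconds: int, budget_seconds: int) -> bool:
    """True if a reindex of the given length finishes inside the startup budget."""
    return reindex_seconds <= budget_seconds


# Hypothetical probe settings: check every 10 s, allow 30 failures.
budget = startup_budget_seconds(initial_delay=0, period=10, failure_threshold=30)
print(budget)  # 300

# Hypothetical one-hour reindex: the pod is restarted long before it finishes.
print(survives_startup(reindex_seconds=3600, budget_seconds=budget))  # False
```

Raising `failureThreshold` widens the budget, but it also delays detection of pods that are genuinely stuck, which is why relaxing the check itself looks more attractive.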

Steps to Reproduce

Delete all indexes, then restart dotCMS. /api/v1/probes/startup will return 503 until there is an Active index, or something close to that.

Acceptance Criteria

Not sure. Perhaps /api/v1/probes/startup can verify that Elasticsearch is reachable, without checking on the health of the Active index.
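As a possible shape for that relaxed check, here is a minimal sketch in Python. It is not how dotCMS implements the probe: the function names are hypothetical, and it assumes Elasticsearch answers plain HTTP on its root URL without authentication. The only point is that the startup decision depends on reachability and never looks at the Active index.

```python
import urllib.error
import urllib.request
from typing import Callable


def es_reachable(base_url: str, timeout: float = 2.0) -> bool:
    """True if the Elasticsearch root endpoint returns a 2xx response.

    Index state is not inspected. Connection failures, HTTP error
    statuses, and malformed URLs all count as "not reachable" here.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        return False


def startup_status(reachable: Callable[[], bool]) -> int:
    """Relaxed startup decision: 200 when Elasticsearch answers, else 503.

    The health of the Active index is deliberately not consulted.
    """
    return 200 if reachable() else 503


# Example wiring (hypothetical URL):
# status = startup_status(lambda: es_reachable("http://localhost:9200"))
```

Index health could still be reported by a separate readiness or liveness check, so a missing or empty Active index would no longer block pod startup.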

dotCMS Version

LTS and current

Proposed Objective

Reliability

Proposed Priority

Priority 2 - Important

External Links... Slack Conversations, Support Tickets, Figma Designs, etc.

https://dotcms.slack.com/archives/C072VF6R2JC/p1724294214346659

Assumptions & Initiation Needs

No response

Quality Assurance Notes & Workarounds

No response

Sub-Tasks & Estimates

No response

yolabingo, Aug 22 '24 19:08