
/liveness endpoint

Open • nhoening opened this issue 3 years ago • 6 comments

An endpoint which checks if everything needed to provide the service is there. E.g. this can check if the postgres db is reachable.

From a Kubernetes-based article:

Liveness probe. This is for detecting whether the application process has crashed/deadlocked. If a liveness probe fails, Kubernetes will stop the pod, and create a new one.

We can use this endpoint in the places where we currently monitor /ping.

nhoening, Oct 16 '21 11:10

I think this task is completed:

https://github.com/FlexMeasures/flexmeasures/blob/main/flexmeasures/api/v3_0/health.py

victorgarcia98, Feb 15 '24 13:02

Liveness and readiness are technically not the same, see the linked article.

We can move this issue further along and decide here:

  • Is it a simple 200 response?
  • Are the startup probe and the liveness probe the same for us?

It might be simple to have an implementation here.

nhoening, Feb 15 '24 14:02

In our case, I think we can let both be the same endpoint (just a 200 response).
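As a rough illustration of the "just a 200 response" idea, here is a minimal sketch using only the Python standard library (a plain WSGI callable), so it stays self-contained. The names `liveness_app` and `call` and the JSON body are assumptions made for this example, not the FlexMeasures implementation.

```python
import json


def liveness_app(environ, start_response):
    """Hypothetical WSGI app: /liveness answers 200 whenever the process can respond."""
    if environ.get("PATH_INFO") == "/liveness":
        status = "200 OK"
        body = json.dumps({"status": "alive"}).encode("utf-8")
        content_type = "application/json"
    else:
        status = "404 Not Found"
        body = b"not found"
        content_type = "text/plain"
    start_response(
        status,
        [("Content-Type", content_type), ("Content-Length", str(len(body)))],
    )
    return [body]


def call(app, path):
    """Invoke a WSGI app in-process and return (status line, body bytes)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    environ = {"PATH_INFO": path, "REQUEST_METHOD": "GET"}
    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

The handler touches nothing outside the process (no DB, no queue), so a failing probe would mean the web process itself is stuck or gone, which is the case where a restart actually helps.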

At first, I wondered whether readiness should also test the DB connection, but the pod can be up while the database is not. Kubernetes would check both services independently.

victorgarcia98, Feb 15 '24 14:02

Did you mean "liveness" when you wrote "readiness"? Because readiness does check the database connection.

nhoening, Feb 15 '24 14:02

True, I was thinking more of liveness. However, it applies to both. Let's say the DB goes down: the web pod could still be ready (if it could connect), so it can work again once the DB is back. If we also checked the DB state in the web pod, we would restart it when that wasn't needed.

victorgarcia98, Feb 15 '24 14:02

I believe readiness checks the necessary conditions, while liveness only checks being alive itself. The web pod is not ready when the underlying DB is not alive.

(readiness also checks that by default)
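To make the distinction concrete, here is a small sketch of the two checks side by side. The function names and the `db_is_reachable` callable are hypothetical stand-ins for whatever the real app would use to ping its database; the 503 status for "not ready" is an assumption of this example.

```python
from typing import Callable, Tuple


def liveness() -> Tuple[int, str]:
    """Liveness: succeeds as long as the process can answer at all."""
    return 200, "alive"


def readiness(db_is_reachable: Callable[[], bool]) -> Tuple[int, str]:
    """Readiness: also requires the dependencies (here, only the DB) to be usable.

    `db_is_reachable` is a hypothetical callable supplied by the caller.
    """
    try:
        ok = db_is_reachable()
    except Exception:
        # A failing check counts as "not ready" rather than crashing the probe.
        ok = False
    if ok:
        return 200, "ready"
    return 503, "database unreachable"
```

With this split, a DB outage makes readiness fail (so traffic can be held back) while liveness keeps passing, so the web pod is not restarted for a problem a restart would not fix.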

By adding a simple liveness, we just increase compatibility with Kubernetes and similar structures.

nhoening, Feb 15 '24 15:02