
health probes such as startupz, readyz

Open yifan opened this issue 1 month ago • 3 comments

🚀 Feature

Health probe endpoints: startupz and readyz, in particular a readyz that reports when the workers are ready

Motivation

When deploying LitServe in production (e.g., within Kubernetes), we need built-in liveness and readiness probe endpoints to manage pod lifecycle properly.

Currently, the main process and worker processes have no standardized way to report their health states. Since model loading (setup()) happens in separate worker processes, the main process cannot easily expose a meaningful readiness signal. This causes Kubernetes to mark pods as ready before workers are actually able to serve inference requests.

Pitch

Add built-in endpoints such as:
• /startupz: returns 200 when the LitServer process has successfully started.
• /readyz: returns 200 only when all worker processes have completed their setup() routines and are ready to serve requests.

These endpoints would allow production orchestrators (e.g., Kubernetes) to safely manage startup, readiness, and liveness of LitServe pods without requiring custom inter-process signaling.

Alternatives

Additional context

yifan avatar Oct 29 '25 20:10 yifan

Hi @yifan, how about using the existing /health endpoint? It also checks that all workers have started and completed setup: https://lightning.ai/docs/litserve/features/health-check

bhimrazy avatar Oct 30 '25 18:10 bhimrazy
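
For readers who want to try the existing endpoint first, here is a minimal sketch of polling it until the workers report ready. It assumes /health answers with HTTP 200 once all workers have finished setup and with an error status before that, as the linked docs describe. The function name `wait_until_healthy` and the example URL are illustrative assumptions, not part of LitServe.

```python
import time
import urllib.request


def wait_until_healthy(url: str, timeout_s: float = 60.0, interval_s: float = 1.0) -> bool:
    """Poll `url` until it answers with HTTP 200 or `timeout_s` elapses.

    Hypothetical helper: only the /health path comes from the thread above.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            # Covers refused connections and non-2xx answers
            # (urllib's HTTPError and URLError both derive from OSError).
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


if __name__ == "__main__":
    # Assumed address; adjust host and port to your deployment.
    print(wait_until_healthy("http://127.0.0.1:8000/health", timeout_s=30))
```

A single endpoint like this cannot by itself distinguish "still starting" from "started but unhealthy", which is the gap the issue is about.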

@yifan are startupz and readyz a standard k8s thing?

williamFalcon avatar Oct 31 '25 13:10 williamFalcon

@williamFalcon @bhimrazy I am using healthz, and I don't think it is enough. Yes, startup probes and readiness probes are a k8s thing.

• startupz: the server is up but not yet ready to receive requests. In my opinion, this is when LitServer is up.
• readyz: the server is ready; this is after all workers are set up and ready.
• healthz: the liveness check, to make sure the server is still alive.

yifan avatar Oct 31 '25 16:10 yifan
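
The three probe semantics described in this thread can be sketched as a small state-to-status mapping. This is only an illustration of the requested behavior, not LitServe's implementation: `ProbeState`, `probe_status`, and the choice of 503 for "not yet" are assumptions. Kubernetes HTTP probes treat any status from 200 up to but not including 400 as success, so 503 counts as a failed probe.

```python
from dataclasses import dataclass, field
from http import HTTPStatus


@dataclass
class ProbeState:
    """Hypothetical health view kept by the main server process."""

    server_started: bool = False
    # One flag per worker id; True once that worker's setup() has finished.
    workers_ready: dict = field(default_factory=dict)
    alive: bool = True


def probe_status(state: ProbeState, path: str) -> HTTPStatus:
    """Map a probe path to the HTTP status it should return for `state`."""
    if path == "/startupz":
        ok = state.server_started
    elif path == "/readyz":
        # Ready only if the server is up and every known worker is ready.
        ok = (
            state.server_started
            and bool(state.workers_ready)
            and all(state.workers_ready.values())
        )
    elif path == "/healthz":
        ok = state.alive
    else:
        return HTTPStatus.NOT_FOUND
    return HTTPStatus.OK if ok else HTTPStatus.SERVICE_UNAVAILABLE


if __name__ == "__main__":
    state = ProbeState(server_started=True, workers_ready={0: True, 1: False})
    for path in ("/startupz", "/readyz", "/healthz"):
        print(path, int(probe_status(state, path)))
```

In this sketch the startup probe passes as soon as the server process is up, while the readiness probe keeps failing until the last worker flips its flag, which is the distinction the issue asks for.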