Document behavior of Kservice readinessProbe and queue-proxy
Describe the change you'd like to see
Document the behavior of the readinessProbe on a Knative Service (ksvc), and how it is translated over to the queue-proxy and user containers in the resulting Pod.
Additional context
The Knative controller turns the Knative Service template into a Pod where the readinessProbe is applied to the queue-proxy container, and the service (user) container has no probe. The queue-proxy then re-implements the HTTP probe of the service container itself. This can lead to confusing behavior.
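For illustration, a minimal Knative Service with an HTTP readinessProbe might look like the sketch below (the service name, image, and probe values are placeholders, not taken from this issue):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                             # placeholder name
spec:
  template:
    spec:
      containers:
        - image: example.com/hello:latest # placeholder image
          readinessProbe:
            httpGet:
              path: /healthz              # hypothetical health endpoint
            initialDelaySeconds: 5
            periodSeconds: 10
```

In the Pod created for the revision, this probe ends up attached to the queue-proxy sidecar rather than staying on the user container, and queue-proxy performs the HTTP check against the user container on its behalf.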
Needs more information from @knative/serving-wg-leads to document, including code examples, etc.
> This can lead to confusing behavior.
Can you elaborate? Are you just looking for an explanation of why we do the probe rewrite?
I think I understand what happens with the probes and probe rewrites now (it's a startup optimization). It would be good to document this for users because it's not obvious that it's happening.
My particular use case involved porting an existing Kubernetes service written in Python using uWSGI + Flask. This application loads a large ML model on startup, which takes ~2-4 minutes before it can start serving HTTP requests. The readiness and liveness probes on the Kubernetes service have a long initialDelaySeconds to account for this.
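For context, the probes on the original Deployment looked roughly like this (the path, port, and exact timings are illustrative placeholders; the point is the long initialDelaySeconds covering the model load):

```yaml
# Probes on the app container of the original Deployment (illustrative values).
readinessProbe:
  httpGet:
    path: /healthz          # hypothetical health endpoint
    port: 8080
  initialDelaySeconds: 240  # ML model takes ~2-4 minutes to load
  periodSeconds: 10
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 240
  periodSeconds: 30
```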
The confusing part of porting it to a Knative Service was that I could see the app container become ready before queue-proxy, and the URL is able to start serving requests ~30-60s before the Pod becomes ready, due to the readinessProbe on the queue-proxy. I think this is happening because Kubernetes and queue-proxy are probing in parallel without any synchronization. It's probably not noticeable for fast-starting containers.
This was confusing in part due to the readinessProbe rewrite, because the app container defaults to a TCP probe and becomes ready immediately, and I can see it getting HTTP probes in the logs before the ML model has loaded.
I'd like to understand how to define the port in the readinessProbe definition. Do I specify no port? If so, does Knative add in its default port (8080), or is there a Kubernetes default? Do I specify a named port? If so, what name? Or do I just put in 8080? The variants I'm unsure between are sketched below.
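To make the question concrete, these are the variants being asked about (the path and port name are placeholders; this is not a statement of which one Knative expects):

```yaml
# Variant 1: omit the port and rely on whatever default Knative/Kubernetes applies
readinessProbe:
  httpGet:
    path: /healthz
---
# Variant 2: reference the container port by name (unclear which name Knative would expect)
readinessProbe:
  httpGet:
    path: /healthz
    port: http1             # placeholder port name
---
# Variant 3: hard-code the assumed default port
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
```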
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.
This issue or pull request is stale because it has been open for 90 days with no activity.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close

/lifecycle stale