Document behavior of Kservice readinessProbe and queue-proxy
Describe the change you'd like to see
Document the behavior of the readinessProbe on a Knative Service (ksvc), and how it is translated over to the queue-proxy and user containers in the resulting Pod.
Additional context
The Knative controller turns the Knative Service template into a Pod where the readinessProbe is applied to the queue-proxy container, and the service (user) container has no probe. The queue-proxy then re-implements the HTTP probe of the service container itself. This can lead to confusing behavior.
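For illustration, a minimal Knative Service with an HTTP readinessProbe might look like the sketch below (the service name, image, and probe values are placeholders, not taken from this issue):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                             # placeholder name
spec:
  template:
    spec:
      containers:
        - image: example.com/hello:latest # placeholder image
          readinessProbe:
            httpGet:
              path: /healthz              # hypothetical health endpoint
            initialDelaySeconds: 5
            periodSeconds: 10
```

In the Pod created for the revision, this probe ends up attached to the queue-proxy sidecar rather than staying on the user container, and queue-proxy performs the HTTP check against the user container on its behalf.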
Needs more information from @knative/serving-wg-leads to document, including code examples, etc.
> This can lead to confusing behavior.
Can you elaborate? Are you just looking for an explanation of why we do the probe rewrite?
I think I understand what happens with the probes and probe rewrites now (it's a startup optimization). It would be good to document this for users because it's not obvious that it's happening.
My particular use case involved porting an existing Kubernetes service written in Python using uWSGI + Flask. This application loads a large ML model on startup, which takes ~2-4 minutes before it can start serving HTTP requests. The readiness and liveness probes on the Kubernetes service have a long initialDelaySeconds to account for this.
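For context, the probes on the original Deployment looked roughly like this (the path, port, and exact timings are illustrative placeholders; the point is the long initialDelaySeconds covering the model load):

```yaml
# Probes on the app container of the original Deployment (illustrative values).
readinessProbe:
  httpGet:
    path: /healthz          # hypothetical health endpoint
    port: 8080
  initialDelaySeconds: 240  # ML model takes ~2-4 minutes to load
  periodSeconds: 10
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 240
  periodSeconds: 30
```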
The confusing part of porting it to a Knative Service was that I could see the app container become ready before queue-proxy, and the URL is able to start serving requests ~30-60s before the Pod becomes ready, due to the readinessProbe on the queue-proxy. I think this is happening because Kubernetes and queue-proxy are probing in parallel without any synchronization. It's probably not noticeable for fast-starting containers.
This was confusing in part due to the readinessProbe rewrite, because the app container defaults to a TCP probe and becomes ready immediately, and I can see it getting HTTP probes in the logs before the ML model has loaded.
I'd like to understand how to define the port in the readinessProbe definition. Do I specify no port? If so, does Knative add in its default port (8080), or is there a Kubernetes default? Do I specify a named port? If so, what name? Or do I just put in 8080? The variants I'm unsure between are sketched below.
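To make the question concrete, these are the variants being asked about (the path and port name are placeholders; this is not a statement of which one Knative expects):

```yaml
# Variant 1: omit the port and rely on whatever default Knative/Kubernetes applies
readinessProbe:
  httpGet:
    path: /healthz
---
# Variant 2: reference the container port by name (unclear which name Knative would expect)
readinessProbe:
  httpGet:
    path: /healthz
    port: http1             # placeholder port name
---
# Variant 3: hard-code the assumed default port
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
```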
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.
This issue or pull request is stale because it has been open for 90 days with no activity.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close

/lifecycle stale