Allow sidecars to specify probe port
Currently, the API for a container's livenessProbe and readinessProbe doesn't allow a port to be specified, since containerPort is already provided. In a multi-container use case, containerPort can only be specified on one container, which provides a clear way to identify the container and port that receive traffic. As a side effect, it's currently not possible to configure a sidecar container with liveness/readiness probes using either httpGet or tcpSocket.
A non-ideal workaround exists for readinessProbe: design the primary container's health-check code to also check the health of any dependent sidecars. This isn't a typical consideration for the service developer and would not work in many cases for third-party containers.
Since k8s restarts the specific container that fails a livenessProbe and NOT the entire pod, this same workaround is not effective for livenessProbe and instead would result in unnecessary restarts of the primary container.
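To make the ask concrete, here is a minimal sketch (illustrative only; the names, images, and ports are made up) of the shape being requested, written against the core/v1 and Knative serving/v1 Go types: the primary container is identified by its single containerPort, and the sidecar carries its own readinessProbe with an explicit port, which is exactly what the validation webhook rejects today.

```go
// Illustrative sketch only: a Service whose primary container is identified by
// its single containerPort, while the sidecar declares its own readinessProbe
// with an explicit port. Names, images, and ports are hypothetical.
package example

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
	servingv1 "knative.dev/serving/pkg/apis/serving/v1"
)

func desiredService() *servingv1.Service {
	return &servingv1.Service{
		ObjectMeta: metav1.ObjectMeta{Name: "app-with-sidecar"},
		Spec: servingv1.ServiceSpec{
			ConfigurationSpec: servingv1.ConfigurationSpec{
				Template: servingv1.RevisionTemplateSpec{
					Spec: servingv1.RevisionSpec{
						PodSpec: corev1.PodSpec{
							Containers: []corev1.Container{{
								Name:  "user-container",
								Image: "example.com/app",
								// The single containerPort marks this as the primary container.
								Ports: []corev1.ContainerPort{{ContainerPort: 8080}},
							}, {
								Name:  "sidecar",
								Image: "example.com/sidecar",
								// Today the validation webhook rejects these fields on a
								// container that has no containerPort.
								ReadinessProbe: &corev1.Probe{
									ProbeHandler: corev1.ProbeHandler{
										HTTPGet: &corev1.HTTPGetAction{
											Path: "/healthz",
											Port: intstr.FromInt(9090),
										},
									},
								},
							}},
						},
					},
				},
			},
		},
	}
}
```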
I can pick this up and work on a PR. Before I do, though...
Is there agreement that:
- This should be fixed
- The approach is to update the field mask to allow livenessProbe and readinessProbe to be defined on sidecar containers only
cc @markusthoemmes
The workaround of proxying through the single exposed container doesn't work for exec probes, which are the de facto way of health-checking gRPC services.
The point around the liveness probe restarting the container vs. the pod is another great point / problem with this suggested workaround.
My personal inclination would be to see us allow {liveness,readiness}Probe on sidecars.
cc @dprotaso @markusthoemmes @savitaashture cc @dgerd (for historical purposes, and because we miss you 😉 )
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.
Ack, allowing the probes seems okay for sidecar containers.
/lifecycle frozen
/area /kind api-change
/good-first-issue
Since it sounds like there's agreement on the general API shape. If not, remove the label with /remove-good-first-issue.
/triage acecpted
@evankanderson: This request has been marked as suitable for new contributors.
Please ensure the request meets the requirements listed here.
If this request no longer meets these requirements, the label can be removed by commenting with the /remove-good-first-issue command.
@evankanderson: The label(s) area/api-change, triage/acecpted cannot be applied, because the repository doesn't have them.
/kind api-change /triage accepted
/assign
OK, so I started looking at this issue (as my "good first issue"); please bear in mind that I am new to the internals. This is what I understand so far:
- For the primary container (the one that specifies a containerPort), the readiness probe is cleared (set to nil) because the queue-proxy executes the readiness probe instead. This is done here: https://github.com/knative/serving/blob/main/pkg/reconciler/revision/resources/deploy.go#L163
- For the primary container, the liveness probe, if HTTP is used, is overridden with networking.BackendHTTPPort so it is redirected to the queue-proxy; if TCP is used, the port provided by the user is kept (see the sketch below).
- For all the sidecar containers, none of this is done.
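To summarise that rewriting in code, here is a simplified sketch; it is not the actual deploy.go code, and the helper name and parameter are made up, but it captures the behaviour described above.

```go
// Simplified sketch of the probe rewriting described above; not the real
// deploy.go code. The helper name and parameter are hypothetical.
package example

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

func rewritePrimaryProbes(c *corev1.Container, queueProxyHTTPPort int) {
	// The readiness probe is cleared: the queue-proxy performs the readiness
	// check against the user container instead of the kubelet doing it directly.
	c.ReadinessProbe = nil

	// An HTTP liveness probe is re-pointed at the queue-proxy's port
	// (networking.BackendHTTPPort); a TCP liveness probe keeps the user's port.
	if c.LivenessProbe != nil && c.LivenessProbe.HTTPGet != nil {
		c.LivenessProbe.HTTPGet.Port = intstr.FromInt(queueProxyHTTPPort)
	}
}
```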
My questions based on this:
- Should we redirect the probes for the sidecar containers through the queue-proxy container, as is done for the main container, or should the sidecars' probes be executed directly by the kubelet? I can't think of a reason why they shouldn't be.
- Internally, no manipulation is done for sidecar probes, but there is a validation webhook screaming at you when you try to add probes to sidecar containers:
Error from server (BadRequest): error when creating "service.yaml": admission webhook "validation.webhook.serving.knative.dev" denied the request: validation failed: must not set the field(s): spec.template.spec.containers[1].livenessProbe, spec.template.spec.containers[1].livenessProbe.initialDelaySeconds, spec.template.spec.containers[1].livenessProbe.periodSeconds
Update: This validation can be turned off by removing this check: https://github.com/knative/serving/blob/main/pkg/apis/serving/k8s_validation.go#L362
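For context, the restriction being relaxed is roughly of this shape. This is only a rough illustration under my assumptions: the real code in k8s_validation.go works through field masks, and the function name below is made up.

```go
// Rough illustration of the restriction under discussion; the real code in
// k8s_validation.go uses field masks, and this function name is hypothetical.
package example

import (
	corev1 "k8s.io/api/core/v1"
	"knative.dev/pkg/apis"
)

func validateContainerProbes(c corev1.Container) *apis.FieldError {
	isPrimary := len(c.Ports) > 0
	// Containers without a containerPort (sidecars) may not declare probes today;
	// relaxing this check is what the issue proposes.
	if !isPrimary && (c.LivenessProbe != nil || c.ReadinessProbe != nil) {
		return apis.ErrDisallowedFields("livenessProbe", "readinessProbe")
	}
	return nil
}
```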
It would be great to understand why this was historically not allowed, as my gut feeling tells me that having probes coming from different places might cause race conditions or random behaviour.
I will keep adding my findings here until I have a PR with some changes to discuss in more detail. If someone can shed some light on the expected behaviour, that would be great.
@dprotaso @evankanderson I've created a simple PR with a test showing that the probes are maintained if the validation is removed from the validation webhook. The question still remains: do we want the sidecar containers' probes to be performed by the kubelet?
I feel that I am missing the context here to make a decision on that. In my head it makes some sense to only manage the "serving" container, but at the same time I realise that I don't fully understand the entire lifecycle of a KService, which might require the queue-proxy container to run the probes for the sidecars as well.
/unassign @salaboy
/assign
Hi @dprotaso! I would love to contribute to Knative, but I am new to this project. Can you please give me some pointers on what to change?
Hey, I'm not Dave, but I saw your note today.
I think the notes left by @salaboy are roughly correct -- when we added sidecar containers, we weren't sure how liveness and readiness probes should interact with the overall Pod health and in particular, the ability of the application and queue-proxy to serve traffic.
I think relaxing the validation Mauricio pointed to is possible and "safe" -- the largest risk would be an increase in cold-start time for pods which define readiness probes on their additional containers.
We intercept the probes on the primary container (as determined by specifying one containerPort, rather than zero) to enable the queue-proxy and activator to probe for initial readiness at a faster interval than supported by kubelet, but this mechanism should either continue to work (since it was independent of kubelet-reported readiness) or should introduce additional readiness delay (delaying cold start, but this can be a documented outcome).
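To illustrate the "faster interval than supported by kubelet" part, here is a minimal sketch; it is not the actual activator or queue-proxy code, and the URL, success criterion, and interval are assumptions, but the idea is simply to poll a readiness endpoint at sub-second granularity rather than at kubelet's one-second periodSeconds minimum.

```go
// Minimal illustration of sub-second initial-readiness polling; not the actual
// activator/queue-proxy code. The URL, success criterion, and interval are
// hypothetical.
package example

import (
	"net/http"
	"time"
)

func waitUntilReady(probeURL string, timeout time.Duration) bool {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		resp, err := http.Get(probeURL)
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return true
			}
		}
		// Much finer-grained than kubelet's 1s minimum periodSeconds.
		time.Sleep(25 * time.Millisecond)
	}
	return false
}
```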
Is this issue still valid? I'm new to Knative and I want to contribute. Based on the history of this issue, the PR (that was raised and closed) and the code, what I understand is that relaxing the liveness probe validation on sidecar containers would be safe to implement, but we still need to keep the primary-container validation based on the containerPort. Am I getting this right, or are there other data points that I need to understand? @evankanderson @dprotaso
I don't think any work has happened on this since my comment above, so this should be available for work.
In that case can I be assigned the issue? @evankanderson
/assign @KaranJagtiani
We intercept the probes on the primary container (as determined by specifying one containerPort, rather than zero) to enable the queue-proxy and activator to probe for initial readiness at a faster interval than supported by kubelet, but this mechanism should either continue to work (since it was independent of kubelet-reported readiness) or should introduce additional readiness delay (delaying cold start, but this can be a documented outcome).
The activator also does active probing (to the queue-proxy) after initial readiness in order to track healthy endpoints for load-balancing decisions.
I think relaxing the validation Mauricio pointed to is possible and "safe"
Thinking about this, I'd say it's not - we're deviating from what's considered a 'ready' Pod by only considering a subset of its containers.
So it seems like we could:
- Ensure all probes go through the queue proxy (excluding exec probes)
- When probes exist on the sidecar containers, the activator skips its probing and relies on the K8s API to understand Pod readiness (this already happens)
To elaborate a bit more - even if we complete #1, since our sidecar is unable to perform exec probes we should probably fall back on #2 and have the activator look at Pod readiness (sketched below).
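A minimal sketch of that fallback, under the assumption that "Pod readiness" means the kubelet-reported Ready condition (standard Kubernetes behaviour: the condition only turns true once every container, sidecars included, has passed its readiness probe). The function is illustrative, not actual activator code.

```go
// Minimal sketch of relying on kubelet-reported Pod readiness; not actual
// activator code.
package example

import corev1 "k8s.io/api/core/v1"

// podIsReady reports whether the Pod's Ready condition is true, i.e. all of its
// containers (primary and sidecars) have passed their readiness probes.
func podIsReady(pod *corev1.Pod) bool {
	for _, cond := range pod.Status.Conditions {
		if cond.Type == corev1.PodReady {
			return cond.Status == corev1.ConditionTrue
		}
	}
	return false
}
```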
I believe we currently disable probing when we have an exec probe configured for a revision
I believe we currently disable probing when we have an exec probe configured for a revision
Only partially; we still do some TCP probing from the queue-proxy. A full test set can be found in the FT. I'm working on a PR, https://github.com/knative/serving/pull/14853, to enable multi-container probing that works for liveness and readiness probes.
/assign