Elaborate on the possibility of container termination completing before Endpoints reconciliation
This is a Feature Request. The section [1] should mention the possibility of all containers being terminated before the Pod has been removed from the associated Endpoints resources, as this could cause some failed connections until the Endpoints controller has completed reconciliation.
1 - https://github.com/kubernetes/website/blob/main/content/en/docs/concepts/workloads/pods/pod-lifecycle.md?plain=1#L415
What would you like to be added: An elaboration on possible flows, rather than just the example flow. Clarify the potential for container termination to complete before Endpoints reconciliation.
Why is this needed: To make Kubernetes users aware of a possible pitfall that causes minor network disruptions.
How I'd tackle this:
- Update https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination-forced to mention that container failure can lead to forced Pod termination (for example, if the restart policy is Never).
- Also update https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination-forced to mention that clients might try sending traffic to containers of the failed Pod, even as that terminated Pod is no longer in a position to accept the traffic.
- And mention that node-level traffic forwarding can cease as soon as the Pod is deleted from the API, separately from whether the container runtime has or hasn't shut down the actual containers.
/language en
Sounds like a good idea, Tim.
I also think that the content on graceful Pod deletion [1] should reflect these issues, which are caused by eventual consistency. When performing a regular rolling update, there is also a potential for opening connections to terminating Pods, as Kubernetes does not (from my understanding) enforce Endpoints updates to take place (and eventually propagate to the service proxy) before the kubelet starts SIGTERM-ing the containers of the terminating Pod. The kubelet and the Endpoints controller operate asynchronously, paying no attention to each other. If I'm wrong and an ordering is enforced, then I don't think it is reflected at the moment (also from [1]):
3. At the same time as the kubelet is starting graceful shutdown...
...
Pods that shut down slowly cannot continue to serve traffic as load balancers (like the service proxy) remove the Pod from the list of endpoints as soon as the termination grace period begins.
I've seen multiple articles [2] addressing this pitfall, and think it should be addressed or at least acknowledged by the official documentation.
1 - https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination
2 - https://engineering.rakuten.today/post/graceful-k8s-delpoyments/
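To make the race concrete, here is a minimal sketch, assuming a plain Go HTTP server (the port, handler, and the 10-second delay are made-up illustrations, not anything prescribed by the docs): after receiving SIGTERM, the app keeps serving for a short period so that requests still routed through stale Endpoints / service-proxy state are not refused while reconciliation catches up, and only then shuts down gracefully.

```go
package main

import (
	"context"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})
	srv := &http.Server{Addr: ":8080", Handler: mux}

	go func() {
		stop := make(chan os.Signal, 1)
		signal.Notify(stop, syscall.SIGTERM)
		<-stop

		// Endpoints reconciliation and kube-proxy updates happen asynchronously
		// with respect to the kubelet sending SIGTERM; keep serving briefly so
		// late-routed requests still succeed. The duration is an assumption.
		time.Sleep(10 * time.Second)

		ctx, cancel := context.WithTimeout(context.Background(), 20*time.Second)
		defer cancel()
		srv.Shutdown(ctx) // stop accepting new connections, drain in-flight ones
	}()

	srv.ListenAndServe()
}
```

The sleep-before-shutdown length is a guess at how long endpoint propagation takes in a given cluster; a preStop hook that sleeps can introduce a similar delay without changing the application.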
/triage accepted
/sig network
/priority backlog
https://github.com/kubernetes/kubernetes/pull/110191#issuecomment-1142294392 says:
From what I can find within the code, K8S has always assumed that a pod without a readiness probe stays in ready=true until it has fully terminated. This PR fixes the case where a pod is defined with a readiness probe, is being terminated, and needs to run the readiness probe on termination.
We should document that Pods that back Services should have a readiness probe defined, wherever this problematic behavior (sending traffic to terminating Pods) is not wanted. Also, the app code should try to fail the readiness probe as soon as Pod termination is signalled.
We should mention this in https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination and also update https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#when-should-you-use-a-readiness-probe to link there for details.
Optionally, also update https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/ to link to https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination
Extra credit: add a diagram somewhere (might be easier once https://github.com/kubernetes/website/pull/36675 has merged)
The overall idea here is to probe Pods for readiness, even during shutdown, and use that readiness information to drop backends out of Services more gracefully.
Because unexpected failures can happen, workloads should also be prepared to handle the case where a Pod disappears from the API, or the container fails hard, even whilst traffic directed to that Pod is in flight.
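As an illustration of the suggestion above to fail readiness as soon as termination is signalled, here is a minimal Go sketch (the /readyz path and port 8080 are assumptions, not something the docs prescribe): the readiness handler starts returning 503 once SIGTERM arrives, while the main handler keeps answering in-flight and late-routed requests during the grace period.

```go
package main

import (
	"net/http"
	"os"
	"os/signal"
	"sync/atomic"
	"syscall"
)

func main() {
	var terminating atomic.Bool

	// Flip to "not ready" as soon as termination is signalled, so the kubelet's
	// readiness probe fails and the Pod can be dropped from Service backends.
	go func() {
		stop := make(chan os.Signal, 1)
		signal.Notify(stop, syscall.SIGTERM)
		<-stop
		terminating.Store(true)
	}()

	// Assumed readiness endpoint; a readinessProbe in the Pod spec would point here.
	http.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		if terminating.Load() {
			http.Error(w, "shutting down", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok")) // real work keeps running during the grace period
	})

	http.ListenAndServe(":8080", nil)
}
```

How quickly traffic actually stops arriving after the probe fails still depends on Endpoints reconciliation and service-proxy propagation in the cluster, which is exactly the eventual-consistency caveat discussed above.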
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
This issue has not been updated in over 1 year, and should be re-triaged.
You can:
- Confirm that this issue is still relevant with /triage accepted (org members only)
- Close this issue with /close
For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/
/remove-triage accepted
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.