`kubectl get sts` does not show any status when a StatefulSet is terminating
The current behavior
[root@demo-dev-master-01 ~]# kubectl get sts
NAME                                                READY   AGE
prometheus-insight-agent-kube-prometh-prometheus    1/1     8d
If the StatefulSet is being deleted but a finalizer blocks the deletion, it will hang there. But if an admin debugs with kubectl get, they cannot see anything about the terminating status. (The only signal is that its metadata.deletionTimestamp is not nil; a way to check this is sketched below.)
- For namespace/pod, the terminating status is quite visible.
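As a workaround today, the admin can check the deletion timestamp directly. A minimal sketch, reusing the StatefulSet name from the output above (an empty result means the object is not being deleted):
# prints metadata.deletionTimestamp if the StatefulSet is being deleted, nothing otherwise
kubectl get sts prometheus-insight-agent-kube-prometh-prometheus -o jsonpath='{.metadata.deletionTimestamp}'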
What would you like to be added: An easily visible indication in the get output that an sts/deployment is in terminating status.
Some proposals:
- Append the status to the name in the default get output:
[root@demo-dev-master-01 ~]# kubectl get sts
NAME                                                             READY   AGE
prometheus-insight-agent-kube-prometh-prometheus(Terminating)    0/1     8d
But this is a significant behavior change; some shell scripts would mistakenly take the status as part of the name.
- Add a new status column to the default get table (or show the status only with -o wide):
[root@demo-dev-master-01 ~]# kubectl get sts
NAME                                                READY   AGE   STATUS
prometheus-insight-agent-kube-prometh-prometheus    0/1     8d    Terminating
- Add a warning event if an sts/deploy is still not deleted after being marked as Terminating for a long period (maybe 1h). Why 1h? The default event-ttl in kube-apiserver is 1h, so by the time a user describes the statefulset/deployment more than an hour later, the original deletion event has already expired, and the admin will still be very confused about why the STS/deploy is not creating a new pod.
The event can just show a message that the object is hanging in termination; see the sketch after this list.
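If such a warning event were added, an admin could surface it with a standard event query. A sketch, assuming the event is attached to the StatefulSet from the example above (the field selectors used are existing event fields):
# list warning events attached to the hanging StatefulSet
kubectl get events --field-selector involvedObject.kind=StatefulSet,involvedObject.name=prometheus-insight-agent-kube-prometh-prometheus,type=Warning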
Why is this needed: As there are more and more operators, finalizers are used widely, and objects stuck in deletion have become a big problem for Ops admins.
When I run get deployments I have a few more columns than are shown in the issue:
❯ kubectl get deployments --all-namespaces
NAMESPACE            NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
kube-system          coredns                  2/2     2            2           130d
local-path-storage   local-path-provisioner   1/1     1            1           130d
Do the READY, UP-TO-DATE, and AVAILABLE fields cover this?
cc @brianpursley
/triage accepted
I looked into this some today, and here is what I found...
The pod's "Status" table value appears to come from here: https://github.com/kubernetes/kubernetes/blob/421ca53be49c4bd64a0c5ce9ceb7c3e17e6e1d11/pkg/printers/internalversion/printers.go#L918-L922
It looks like pod, pv, and pvc are the only ones that indicate that they are terminating when DeletionTimestamp is set.
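For comparison, this is the behaviour being asked for: a pod with a DeletionTimestamp already surfaces it in its STATUS column (illustrative output only; the pod name and values here are made up):
❯ kubectl get pods
NAME            READY   STATUS        RESTARTS   AGE
example-pod-0   1/1     Terminating   0          8d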
I like the idea of adding a status column, but we will need to figure out what it should say when it is not terminating. For example, if there is a deployment and 2/3 are Ready, what should it say for the status of the deployment in that case?
I think we also need to make sure adding a new column is not considered a breaking change. I'm pretty sure we've said in the past that the table format is not considered an API and could be changed, so this should not be a problem.
Finally, what about describers? Should we update the output of kubectl describe to indicate that the resource is terminating when it has a DeletionTimestamp? It seems like it would also be helpful to see that there.
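For instance, the describer output could one day include a line along these lines (purely illustrative; neither the field name nor the wording exists in kubectl describe today):
❯ kubectl describe sts prometheus-insight-agent-kube-prometh-prometheus
Name:                prometheus-insight-agent-kube-prometh-prometheus
Status:              Terminating (deletionTimestamp set, blocked by finalizers)
...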
/assign
> Finally, what about describers? Should we update the output of kubectl describe to indicate that the resource is terminating when it has a DeletionTimestamp? It seems like it would also be helpful to see that there.
:+1:
This issue has not been updated in over 1 year, and should be re-triaged.
You can:
- Confirm that this issue is still relevant with /triage accepted (org members only)
- Close this issue with /close
For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/
/remove-triage accepted
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
Given the discussion on the PR as well as here, I think this issue is still relevant. I'd love to see some status field for StatefulSet as well; it will help with consistency and be a convenience to the user.
/triage accepted
Just for discussion purposes, the last PR was put on hold due to this comment.
FWIW, I see this behaviour with a Deployment, not just a StatefulSet, when it has the kubernetes finalizer as well. What we do about changing get's output is debatable, but we could surely add DeletionTimestamp to the describe output for some clarity.
If that makes sense, I'd like to open a PR for that part and then drive the conversation for how/if we could/should change the output of get in such a scenario.
@mpuckett159 @brianpursley @lmktfy WDYT?