`kubectl get sts` cannot show its status if it is terminating

Open pacoxu opened this issue 2 years ago • 10 comments

The current behavior

[root@demo-dev-master-01 ~]# kubectl get sts
NAME                                                READY   AGE
prometheus-insight-agent-kube-prometh-prometheus    1/1     8d

If the StatefulSet is being deleted and a finalizer blocks the deletion, it hangs in that state. But an admin debugging with kubectl get cannot see anything about the terminating status. (The only signal is that metadata.deletionTimestamp is not nil.)

  • For namespace/pod, the terminating status is quite visible.
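Until the table output changes, the deletionTimestamp signal can be read from JSON output in a script. A minimal sketch in Python, assuming input shaped like `kubectl get sts -o json` (a List with an `items` array). The helper names `is_terminating` and `terminating_objects`, the sample objects, and the `example.com/cleanup` finalizer are made up for illustration:

```python
import json


def is_terminating(obj: dict) -> bool:
    """An object is terminating once metadata.deletionTimestamp is set."""
    return obj.get("metadata", {}).get("deletionTimestamp") is not None


def terminating_objects(listing: dict) -> list:
    """Return (name, finalizers) for every terminating item in a List."""
    result = []
    for item in listing.get("items", []):
        if is_terminating(item):
            meta = item["metadata"]
            result.append((meta["name"], meta.get("finalizers", [])))
    return result


# Hypothetical sample shaped like `kubectl get sts -o json` output.
sample = json.loads("""
{"items": [
  {"metadata": {"name": "web"}},
  {"metadata": {"name": "db",
                "deletionTimestamp": "2023-06-21T06:00:00Z",
                "finalizers": ["example.com/cleanup"]}}
]}
""")

for name, finalizers in terminating_objects(sample):
    print(f"{name} is terminating; finalizers: {finalizers}")
```

Listing the finalizers next to the name is a convenience here, since a remaining finalizer is what keeps the object from going away.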

What would you like to be added: An easily visible indication that an sts/deployment is terminating.

Some proposals:

  1. Append the status to the name. But this is a significant behavior change; some shell scripts will mistakenly take the status as part of the name.
[root@demo-dev-master-01 ~]# kubectl get sts
NAME                                                             READY   AGE
prometheus-insight-agent-kube-prometh-prometheus(Terminating)    0/1     8d
  2. Add a new status column to the default get table. (Or show the status only with -o wide.)
[root@demo-dev-master-01 ~]# kubectl get sts
NAME                                                READY   AGE   STATUS
prometheus-insight-agent-kube-prometh-prometheus    0/1     8d    Terminating
  3. Add a warning event if an sts/deploy has not been deleted after being marked as terminating for a long period (maybe 1h). Why do I choose 1h here? The default event-ttl in kube-apiserver is 1h. So when the user describes the statefulset/deployment after 1h, the deletion event is no longer there, and the admin will still be very confused about why the sts/deploy is not creating a new pod.

The event can simply say that the object is stuck in termination.
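The third proposal would need a change on the controller side, but the check behind it can be sketched client-side. A rough Python sketch, assuming the object is a dict parsed from `kubectl get ... -o json` and that deletionTimestamp uses the usual `YYYY-MM-DDTHH:MM:SSZ` form. The names `terminating_for`, `is_stuck`, and `STUCK_AFTER` are hypothetical, and the 1h threshold is the value suggested above:

```python
from datetime import datetime, timedelta, timezone

# Threshold suggested in the proposal; matches the default event-ttl of 1h.
STUCK_AFTER = timedelta(hours=1)


def terminating_for(obj: dict, now: datetime):
    """Return how long the object has been terminating, or None if it is not."""
    ts = obj.get("metadata", {}).get("deletionTimestamp")
    if ts is None:
        return None
    deleted_at = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    return now - deleted_at


def is_stuck(obj: dict, now: datetime, threshold: timedelta = STUCK_AFTER) -> bool:
    """True if the object has been terminating for longer than the threshold."""
    elapsed = terminating_for(obj, now)
    return elapsed is not None and elapsed > threshold


# Hypothetical object that was marked for deletion two hours before `now`.
sts = {"metadata": {"name": "db", "deletionTimestamp": "2023-06-21T06:00:00Z"}}
now = datetime(2023, 6, 21, 8, 0, 0, tzinfo=timezone.utc)

if is_stuck(sts, now):
    print(f"{sts['metadata']['name']} has been terminating for {terminating_for(sts, now)}")
```

In real use `now` would be `datetime.now(timezone.utc)`; it is passed in explicitly here so the check is easy to test.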

Why is this needed: With more and more operators in use, finalizers are now widely used, and this has become a big problem for Ops admins.

pacoxu avatar Jun 21 '23 06:06 pacoxu

When I run get deployments I have a few more columns than are shown in the issue:

❯ kubectl get deployments --all-namespaces
NAMESPACE            NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
kube-system          coredns                  2/2     2            2           130d
local-path-storage   local-path-provisioner   1/1     1            1           130d

Do the READY, UP-TO-DATE, and AVAILABLE fields cover this?

cc @brianpursley

mpuckett159 avatar Jun 22 '23 05:06 mpuckett159

/triage accepted

mpuckett159 avatar Jun 22 '23 05:06 mpuckett159

I looked into this some today, and here is what I found...

The pod's "Status" table value appears to come from here: https://github.com/kubernetes/kubernetes/blob/421ca53be49c4bd64a0c5ce9ceb7c3e17e6e1d11/pkg/printers/internalversion/printers.go#L918-L922

It looks like pod, pv, and pvc are the only ones that indicate that they are terminating when DeletionTimestamp is set.

I like the idea of adding a status column, but we will need to figure out what it should say when it is not terminating. For example, if there is a deployment and 2/3 are Ready, what should it say for the status of the deployment in that case?
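To make the open question concrete, here is a small Python sketch of a table with a status column. It is not the printers.go logic, only an illustration: the STATUS value is "Terminating" when deletionTimestamp is set, and otherwise falls back to a caller-supplied placeholder (`not_terminating`, empty by default), which is exactly the part that still needs a decision. The function names and sample objects are made up; READY is built from `status.readyReplicas` and `spec.replicas`:

```python
def status_for(obj: dict, not_terminating: str = "") -> str:
    """Return "Terminating" if deletionTimestamp is set, else the placeholder."""
    if obj.get("metadata", {}).get("deletionTimestamp"):
        return "Terminating"
    return not_terminating


def render_table(objs: list) -> str:
    """Render NAME / READY / STATUS rows with left-aligned, padded columns."""
    rows = [("NAME", "READY", "STATUS")]
    for obj in objs:
        ready = obj.get("status", {}).get("readyReplicas", 0)
        desired = obj.get("spec", {}).get("replicas", 0)
        rows.append((obj["metadata"]["name"], f"{ready}/{desired}", status_for(obj)))
    widths = [max(len(row[i]) for row in rows) for i in range(3)]
    lines = []
    for row in rows:
        cells = (cell.ljust(width) for cell, width in zip(row, widths))
        lines.append("   ".join(cells).rstrip())
    return "\n".join(lines)


# Hypothetical objects: one healthy-ish, one blocked on a finalizer.
objs = [
    {"metadata": {"name": "web"}, "spec": {"replicas": 3}, "status": {"readyReplicas": 2}},
    {
        "metadata": {"name": "db", "deletionTimestamp": "2023-06-21T06:00:00Z"},
        "spec": {"replicas": 1},
        "status": {},
    },
]

print(render_table(objs))
```

With the empty placeholder, the 2/3 case simply shows a blank STATUS cell; any other wording would be passed in through `not_terminating`.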

I think we also need to make sure adding a new column is not considered a breaking change. I'm pretty sure we've said in the past that the table format is not considered an API and could be changed, so this should not be a problem.

Finally, what about describers? Should we update the output of kubectl describe to indicate that the resource is terminating when it has a DeletionTimestamp? It seems like it would also be helpful to see that there.

brianpursley avatar Jun 26 '23 22:06 brianpursley

/assign

carlory avatar Jul 25 '23 09:07 carlory

Finally, what about describers? Should we update the output of kubectl describe to indicate that the resource is terminating when it has a DeletionTimestamp? It seems like it would also be helpful to see that there.

:+1:

sftim avatar Aug 08 '23 13:08 sftim

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

k8s-triage-robot avatar Aug 07 '24 14:08 k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Nov 05 '24 14:11 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Dec 05 '24 15:12 k8s-triage-robot

Given the discussion on the PR as well as here, I think this issue is still relevant. I'd love to see a status field for StatefulSets as well; it would help with consistency and be a convenience to the user.

/triage accepted

Ritikaa96 avatar Dec 11 '24 04:12 Ritikaa96

Just for discussion purposes, the last PR was put on hold due to this comment.

Ritikaa96 avatar Dec 11 '24 04:12 Ritikaa96

FWIW, I see this behaviour with a Deployment that has the kubernetes finalizer as well, not just a StatefulSet. What we do about changing get's output is debatable, but we could surely add DeletionTimestamp to the describe output for some clarity.

If that makes sense, I'd like to open a PR for that part and then drive the conversation for how/if we could/should change the output of get in such a scenario.

@mpuckett159 @brianpursley @lmktfy WDYT?

dharmit avatar Aug 01 '25 07:08 dharmit