sig-storage-lib-external-provisioner
Provisioner does not allow rescheduling if a Node is deleted after a pod is scheduled
If a node is deleted after a pod has been scheduled onto it (but before the claim is provisioned), the pod can become indefinitely stuck in a Pending state.
Typically, when a failure occurs during provisioning, the provisioner relinquishes control back to the scheduler so it can reschedule the pod somewhere else. It does this by removing the `volume.kubernetes.io/selected-node` annotation from the PVC; the controller returns `ProvisioningFinished` from `provisionClaimOperation`. This happens, for example, when storage cannot be scheduled on the selected node: https://github.com/kubernetes-sigs/sig-storage-lib-external-provisioner/blob/master/controller/controller.go#L1420
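For reference, handing the pod back to the scheduler is just an annotation update on the PVC. A minimal sketch of that step, assuming a client-go clientset (this is not the library's actual helper):

```go
// reschedulesketch shows, in isolation, how a provisioner can give a claim back
// to the scheduler by clearing the selected-node annotation on the PVC.
package reschedulesketch

import (
	"context"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

const selectedNodeAnnotation = "volume.kubernetes.io/selected-node"

// rescheduleClaim removes the selected-node annotation so the scheduler will
// pick a new node for the pending pod on its next pass.
func rescheduleClaim(ctx context.Context, client kubernetes.Interface, claim *v1.PersistentVolumeClaim) error {
	if _, ok := claim.Annotations[selectedNodeAnnotation]; !ok {
		return nil // nothing to undo; no node was selected yet
	}
	updated := claim.DeepCopy()
	delete(updated.Annotations, selectedNodeAnnotation)
	_, err := client.CoreV1().PersistentVolumeClaims(updated.Namespace).Update(ctx, updated, metav1.UpdateOptions{})
	return err
}
```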
However, if a Node becomes unavailable after it has been selected by the scheduler, the provisioner does not remove this annotation, because it returns `ProvisioningNoChange` from `provisionClaimOperation`. That behavior is potentially useful in situations where a selected Node is only eventually consistent and may still become available. But when the Node has been deleted, the condition is unrecoverable and requires the user to intervene: adding the exact node back (infeasible for dynamically provisioned node names), deleting and re-creating the pod so the scheduler can reschedule it, or manually removing the `volume.kubernetes.io/selected-node` annotation from the PVC.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Reopen this issue or PR with `/reopen`
- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close
@k8s-triage-robot: Closing this issue.
In response to this:
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Reopen this issue or PR with `/reopen`
- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/remove-lifecycle rotten
/reopen
@amacaskill: Reopened this issue.
In response to this:
/reopen
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Repro using a VolumeSnapshot to delay provisioning: https://gist.github.com/pwschuurman/fd9c8c50889ce2382bcdca259c51d3e4

1. Create a VolumeSnapshot that references a non-existent disk (or a disk that takes a long time to copy, so the VolumeSnapshot does not become ready immediately)
2. Create a PVC that references the VolumeSnapshot as a DataSource (a Go sketch of this claim follows the list)
3. Create a pod that references said PVC. The scheduler will select a node for the pod and add the `volume.kubernetes.io/selected-node` annotation to the PVC.
4. While the operation from (1) is still pending, delete the node that was selected for the PVC. This can happen under normal conditions due to node repair, upgrade, or autoscaling.
5. Once the VolumeSnapshot becomes ready, the provisioner starts to emit `failed to get target node`. The PVC must be deleted (or the annotation removed) to fix this problem.
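For illustration, here is a hypothetical Go equivalent of the claim from step 2 (the names, namespace, and storage class are assumptions, not taken from the gist, and the types follow the core/v1 API of the client-go version in use at the time):

```go
// reprosketch builds the kind of claim used in the repro: a PVC whose DataSource
// points at a VolumeSnapshot that will not become ready for a while.
package reprosketch

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func reproPVC() *v1.PersistentVolumeClaim {
	snapshotAPIGroup := "snapshot.storage.k8s.io"
	storageClass := "example-sc" // assumption: a CSI StorageClass with WaitForFirstConsumer binding
	return &v1.PersistentVolumeClaim{
		ObjectMeta: metav1.ObjectMeta{Name: "repro-pvc", Namespace: "default"},
		Spec: v1.PersistentVolumeClaimSpec{
			AccessModes:      []v1.PersistentVolumeAccessMode{v1.ReadWriteOnce},
			StorageClassName: &storageClass,
			// Restoring from a snapshot that is slow (or references a missing disk)
			// keeps the provision operation pending after the pod is scheduled.
			DataSource: &v1.TypedLocalObjectReference{
				APIGroup: &snapshotAPIGroup,
				Kind:     "VolumeSnapshot",
				Name:     "slow-or-missing-snapshot", // assumption: a not-yet-ready snapshot
			},
			Resources: v1.ResourceRequirements{
				Requests: v1.ResourceList{v1.ResourceStorage: resource.MustParse("10Gi")},
			},
		},
	}
}
```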
Some ideas on how to handle this:
- Add a timeout that removes the annotation after some period of time: if a `volume.kubernetes.io/selected-node` annotation becomes stale, remove it from the PVC. This is troublesome because some delays can take a long time (e.g. waiting for a snapshot to be created) and may not fit into a well-defined timeout period.
- Update csi-provisioner to use an informer rather than a lister. This would allow the provisioner to be aware of deletion events for a node and remove the annotation for affected volumes. The provisioner would likely need to keep a cache of node -> volume in order to update the affected volumes (see the sketch after this list).
- Update the scheduler to remove the annotation on node deletion.
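A rough sketch of the informer-based idea, assuming a client-go clientset (this is not the csi-provisioner implementation; a real controller would maintain a node->PVC cache instead of listing all claims on every deletion):

```go
// informersketch watches Node deletions and clears the selected-node annotation
// on PVCs that were scheduled to the deleted node, so the scheduler can pick again.
package informersketch

import (
	"context"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

const selectedNodeAnnotation = "volume.kubernetes.io/selected-node"

func watchNodeDeletions(ctx context.Context, client kubernetes.Interface) {
	factory := informers.NewSharedInformerFactory(client, 0)
	nodeInformer := factory.Core().V1().Nodes().Informer()

	nodeInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		DeleteFunc: func(obj interface{}) {
			node, ok := obj.(*v1.Node)
			if !ok {
				// The object may be a DeletedFinalStateUnknown tombstone.
				tombstone, ok := obj.(cache.DeletedFinalStateUnknown)
				if !ok {
					return
				}
				node, ok = tombstone.Obj.(*v1.Node)
				if !ok {
					return
				}
			}
			// Naive lookup for illustration only: scan all claims for ones that
			// selected the deleted node, and strip the annotation from each.
			pvcs, err := client.CoreV1().PersistentVolumeClaims(metav1.NamespaceAll).List(ctx, metav1.ListOptions{})
			if err != nil {
				return
			}
			for i := range pvcs.Items {
				pvc := pvcs.Items[i]
				if pvc.Annotations[selectedNodeAnnotation] != node.Name {
					continue
				}
				updated := pvc.DeepCopy()
				delete(updated.Annotations, selectedNodeAnnotation)
				client.CoreV1().PersistentVolumeClaims(updated.Namespace).Update(ctx, updated, metav1.UpdateOptions{})
			}
		},
	})

	factory.Start(ctx.Done())
	factory.WaitForCacheSync(ctx.Done())
}
```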
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
/remove-lifecycle stale
/triage accepted
I think this is the same issue as https://github.com/kubernetes/kubernetes/issues/100485
Another option we discussed: remove the annotation when the provisioner tries to access a Node that doesn't exist, by detecting `errors.NewNotFound`.
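As a sketch of what that detection could look like (hypothetical code, not the library's actual implementation), assuming a client-go clientset:

```go
// notfoundsketch treats a deleted selected node as a terminal provisioning error,
// so the controller can strip the selected-node annotation and let the scheduler
// place the pod again, instead of retrying forever with "no change".
package notfoundsketch

import (
	"context"
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// checkSelectedNode returns a terminal error when the selected node no longer exists.
func checkSelectedNode(ctx context.Context, client kubernetes.Interface, nodeName string) error {
	_, err := client.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if apierrors.IsNotFound(err) {
		// Unrecoverable: the node was deleted. Signalling a final failure here
		// (rather than "no change") is what allows the annotation to be removed.
		return fmt.Errorf("selected node %q was deleted, rescheduling: %w", nodeName, err)
	}
	return err
}
```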
/assign @sunnylovestiramisu
Reproduced the error with the following steps:
- `kubetest --build --up`
- Deploy a PD CSI driver via [gcp-compute-persistent-disk-csi-driver/deploy/kubernetes/deploy-driver.sh](https://goto.google.com/src)
- Create a storage class, create a PVC with the `volume.kubernetes.io/selected-node` annotation, create a pod
- The PVC stayed in the Pending state
- Check the csi-provisioner logs via `k logs -n gce-pd-csi-driver csi-gce-pd-controller-container csi-provisioner`:
W0308 00:51:37.588114 1 controller.go:934] Retrying syncing claim "xxxxxx", failure 12
E0308 00:51:37.588141 1 controller.go:957] error syncing claim "xxxxxx": failed to get target node: node "non-exist-node" not found
I0308 00:51:37.588381 1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"task-pv-claim", UID:"xxxxxx", APIVersion:"v1", ResourceVersion:"4824", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to get target node: node "non-exist-node" not found
Manually tested with the fix https://github.com/kubernetes-sigs/sig-storage-lib-external-provisioner/pull/139:
- Copy sig-storage-lib-external-provisioner with the fix into the external-provisioner vendor directory
- `make container` to build a new external-provisioner image
- Upload it to GCR and replace the driver link in the stable-master image.yaml
- Spin up a k8s cluster on GCE via `kubetest --build --up`
- Deploy a PD CSI driver via `gcp-compute-persistent-disk-csi-driver/deploy/kubernetes/deploy-driver.sh`
- Create a storage class, create a PVC with the `volume.kubernetes.io/selected-node` annotation, create a pod
- The PVC was provisioned successfully: "Successfully provisioned volume pvc-xxxxxx"
We should cherry-pick to external-provisioner 3.2, 3.3, 3.4
The releases have been published in external-provisioner:
- https://github.com/kubernetes-csi/external-provisioner/releases/tag/v3.3.1
- https://github.com/kubernetes-csi/external-provisioner/releases/tag/v3.2.2
- https://github.com/kubernetes-csi/external-provisioner/releases/tag/v3.4.1
/close
@sunnylovestiramisu: Closing this issue.
In response to this:
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.