pipeline icon indicating copy to clipboard operation
pipeline copied to clipboard

Allow PVC of completed PipelineRun to be deleted

Open jimmyjones2 opened this issue 3 years ago • 21 comments

Feature request

For PVCs specified as part of a PipelineRun with volumeClaimTemplate:

  • Once a PipelineRun has completed, remove the pvc-protection finalizer from the PVC to allow it to be deleted by the user (similar to https://github.com/argoproj/argo-workflows/issues/6629)
  • Provide an option to automatically delete the PVC on completion of PipelineRun

Use case

If a PVC is specified as part of a PipelineRun with volumeClaimTemplate, the PVC lifecycle is the same as the PipelineRun. In the case where PipelineRuns are kept for longer periods this is wasteful of storage resources and some users may want to delete the associated PVC before the PipelineRun, perhaps to keep logs, or because the PipelineRun is managed by a gitops controller.

See #1784 for when this was first noticed after PVC protection had been implemented in Kubernetes

jimmyjones2 avatar Nov 19 '22 14:11 jimmyjones2

If I remember it correctly from implementing volumeClaimTemplate. The PVC cannot be deleted because the Pod has the volume mounted, so the Pod needs to be deleted first.

jlpettersson avatar Feb 02 '23 18:02 jlpettersson

Hey @jlpettersson - a completed pod no longer has the PVC mounted - easily demonstrated as it doesn't prevent a RWO PVC from being mounted by another node. However the PVC protection feature prevents PVCs from being deleted if referenced by pods, even if in completed state.

The linked Argo workflows PR removes the PVC protection finalizer when it knows all the pods are completed and it's safe to do so

jimmyjones2 avatar Feb 02 '23 20:02 jimmyjones2

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale with a justification. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with /close with a justification. If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

tekton-robot avatar May 03 '23 21:05 tekton-robot

/remove-lifecycle stale Still an issue

jimmyjones2 avatar May 04 '23 07:05 jimmyjones2

/remove-lifecycle stale

vdemeester avatar May 04 '23 12:05 vdemeester

Just came across this issue this morning where I didn't expect the PVC to be seen after the pipelineRun was completed.

I am trying to understand... what is the use of such PVCs? Aren't they unnecessary reserving resources? Shouldn't this be a bug rather than a feature?

I think, the pvc created with volumeClaimTemplate should be deleted once the pipelineRun is completed.

/cc @skaegi

pritidesai avatar May 05 '23 16:05 pritidesai

Thanks @skaegi for the references: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#limitations

Deleting and/or scaling a StatefulSet down will not delete the volumes associated with the StatefulSet. This is done to ensure data safety, which is generally more valuable than an automatic purge of all related StatefulSet resources.

The statefulset is deleted when a pipelineRun is completed but the pvc created with volumeClaimTemplate is not cleaned up. We could at the least provide an option either cluster wide or per pipelineRun to clean up pvc so that the operators or the users can choose how they would like to configure their deployments.

pritidesai avatar May 05 '23 17:05 pritidesai

@pritidesai An option to clean up PVCs per PipelineRun would be awesome! I suppose in some cases people might want the option to keep them around for debugging?

For various reasons I have to keep completed PipelineRuns around, and if I were using volumeClaimTemplates each would consume it's own PVC which would be wasteful and expensive (I'm building 50+ apps). As a workaround I have to use a shared RWX PVC as the workspace with a subPath :vomiting_face:

jimmyjones2 avatar May 05 '23 18:05 jimmyjones2

Related: https://github.com/tektoncd/pipeline/issues/5412

jerop avatar May 08 '23 16:05 jerop

Hey @jimmyjones2 I have created a PoC here - https://github.com/tektoncd/pipeline/pull/6635 Please let me know if this works for you. Thanks! Once we finalize on the design, I will update the PR to include tests. The changes in the PR are tested locally.

pritidesai avatar May 09 '23 05:05 pritidesai

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale with a justification. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with /close with a justification. If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

tekton-robot avatar Aug 07 '23 06:08 tekton-robot

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten with a justification. Rotten issues close after an additional 30d of inactivity. If this issue is safe to close now please do so with /close with a justification. If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

tekton-robot avatar Sep 06 '23 06:09 tekton-robot

@QuanZhang-William Just tried v0.51.0 with coschedule: workspaces and was expecting the PVC to be deleted as per #6940 - however there seems to be a race condition - sometimes the PVC is deleted, but other times I was left with a PVC in Terminating state as the kubernetes.io/pvc-protection wasn't removed

jimmyjones2 avatar Sep 17 '23 16:09 jimmyjones2

Hi @jimmyjones2, https://github.com/tektoncd/pipeline/pull/6940 only supports purging PVC with coschedule:pipelineruns mode. The change is not applied to coschedule:workspaces mode due to backward compatibility concerns.

You can see more detail in this discussion

QuanZhang-William avatar Sep 18 '23 19:09 QuanZhang-William

Sorry @QuanZhang-William I wrote that incorrectly - when I tried coschedule: pipelineruns I sometimes ended up with terminating PVCs. With cochedule: workspaces I got the previous behavior of the PVC remaining after completion.

jimmyjones2 avatar Sep 19 '23 07:09 jimmyjones2

@jimmyjones2 . Thanks for reporting the bug! With some investigation, the flake is caused by the wrong order of purging finalizer and deleting PVCs.

It seems like finalizers cannot be patched/purged when the PVC is not in terminating status, so we need to delete the PVC first and then purge the finalizer.

I'm working on a PR to fix it.

QuanZhang-William avatar Sep 20 '23 20:09 QuanZhang-William

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen with a justification. Mark the issue as fresh with /remove-lifecycle rotten with a justification. If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

tekton-robot avatar Oct 20 '23 21:10 tekton-robot

@tekton-robot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen with a justification. Mark the issue as fresh with /remove-lifecycle rotten with a justification. If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

tekton-robot avatar Oct 20 '23 21:10 tekton-robot

/reopen /remove-lifecycle rotten

vdemeester avatar Oct 23 '23 08:10 vdemeester

@vdemeester: Reopened this issue.

In response to this:

/reopen /remove-lifecycle rotten

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

tekton-robot avatar Oct 23 '23 08:10 tekton-robot