zfs-localpv icon indicating copy to clipboard operation
zfs-localpv copied to clipboard

fix(controller): prevent volume deletion if snapshot exists

Open akhilerm opened this issue 3 years ago • 4 comments

Why is this PR required? What issue does it fix?:

  • Volume deletion should be stopped if a clone/snapshot exists
  • Snapshot deletion should be stopped if a clone exists

What this PR does?:

  • add a check in DeleteVolume() call in controller side to validate that no snapshot/clone exists before proceeding to volume deletion
  • add a check in DeleteSnapshot() call in controller side to validate that no clone exists before proceeding to snapshot deletion

Does this PR require any upgrade changes?: No

If the changes in this PR are manually verified, list down the scenarios covered::

  1. Create a PVC, csi-zfspv
  2. Create a snapshot zfs-snap from the above pvc
  3. Create a clone zfs-snap-clone from the above snapshot
  4. Create a clone zfs-vol-clone from the pvc csi-zfspv
  5. Delete the volume csi-zfspv. Deletion fails with the error
E0624 07:47:30.582195       1 grpc.go:79] GRPC error: rpc error: code = Internal desc = failed to handle delete volume request for {pvc-53e5ce41-5755-4833-b6c4-bb48188b9293} with 1 snapshots
  1. Delete the snapshot zfs-snap. The deletion fails with the error in controller
E0624 08:20:14.395489       1 grpc.go:79] GRPC error: rpc error: code = Internal desc = failed to handle delete snapshot request for {pvc-53e5ce41-5755-4833-b6c4-bb48188b9293@snapshot-2ca86a72-6153-423a-9604-78594aa03517} with 1 clones
  1. Delete the clone zfs-snap-clone and the snapshot will be deleted.
  2. But now the delete volume csi-zfspv will be failing with error
E0624 08:21:16.433858      1 grpc.go:79] GRPC error: rpc error: code = Internal desc = failed to handle delete volume request for {pvc-53e5ce41-5755-4833-b6c4-bb48188b9293} with 1 clones

since a clone also exists for this volume 8. Delete the clone zfs-vol-clone and the volume gets deleted

Any additional information for your reviewer? : NOTE Currently to find the clones of a snapshot, all volumes are listed and then searched for a matching snapname. This process can be made easier if a label is added to the clone volumes created from a snapshot

Checklist:

  • [x] Fixes #123
  • [x] PR Title follows the convention of <type>(<scope>): <subject>
  • [x] Has the change log section been updated?
  • [ ] Commit has unit tests
  • [ ] Commit has integration tests
  • [ ] (Optional) Are upgrade changes included in this PR? If not, mention the issue/PR to track:
  • [ ] (Optional) If documentation changes are required, which issue on https://github.com/openebs/openebs-docs is used to track them:

akhilerm avatar Jun 22 '21 14:06 akhilerm

@akhilerm csi-sanity is failing. we need to have waitforSnapDestroy function similar to volume destroy https://github.com/openebs/zfs-localpv/blob/b32da5a16580d358f26c19878158616f0bb82a80/pkg/driver/controller.go#L169. We should wait in the DeleteSnapshot call

	if _, ok := parameters["wait"]; ok {
		if err := waitForSnapDestroy(snapName); err != nil {
			return nil, err
		}
	}

pawanpraka1 avatar Jun 24 '21 12:06 pawanpraka1

@pawanpraka1 But parameters are not present in the DeleteSnapshot() call. How should I get the parameters?

akhilerm avatar Jun 27 '21 14:06 akhilerm

@pawanpraka1 Updated with the changes for waitForSnapDestroy

akhilerm avatar Jun 28 '21 05:06 akhilerm

May I suggest there to be an exception to the constraint that a volume can't be deleted if there is a snapshot created for that volume. Let me explain on the ZFS level...

Suppose we have a ZFS volume vol1 an create 3 snapshots for it: vol1@snap1, vol1@snap2, vol1@snap3. Suppose we take the oldest of those snapshots: vol1@snap3 and create a volume clone from it: vol2. Now we have these dependencies:

vol1 <- vol1@snap1 vol1 <- vol1@snap2 vol1 <- vol1@snap3 <- vol2

ZFS can turn these dependencise upside-down with "zfs promote vol2" command (https://openzfs.github.io/openzfs-docs/man/master/8/zfs-promote.8.html) and create this:

vol2 <- vol2@snap1 vol2 <- vol2@snap2 vol2 <- vol2@snap3 <- vol1

Now after that promotion of vol2, vol1 may be deleted as nothing is dependent on it.

So what I suggest is the following exception to the constraint you are adding with this pull request:

  • if the volume being deleted has several snapshots and the oldest of them has a single cloned volume created, allow deletion of the volume but 1st promote the clone.

Perhaps even a better alternative would be to have a special CRD that would allow execution of "zfs promote" action. Or maybe an attribute on a PVC that would trigger the "zfs promote" action... In that case this pull request can do one thing only.

plevart avatar Sep 25 '23 21:09 plevart