velero icon indicating copy to clipboard operation
velero copied to clipboard

Respect the "snapshot.storage.kubernetes.io/is-default-class: true" annotation of VolumeSnapshotClass when taking the CSI snapshot

Open ywk253100 opened this issue 1 year ago • 12 comments

Currently, Velero chooses the VolumeSnapshotClass with the annotation velero.io/csi-volumesnapshot-class: "true" added when taking CSI snapshot, there is an official annotation snapshot.storage.kubernetes.io/is-default-class: true introduced to specify a default VolumeSnapshotClass for VolumeSnapshot that don't request any particular class to bind to.

So we could refine the current VolumeSnapshotClass choosing logic as follows:

  1. Use the default VolumeSnapshotClass annotated with snapshot.storage.kubernetes.io/is-default-class: true
  2. Else use the VolumeSnapshotClass annotated with velero.io/csi-volumesnapshot-class: "true"

Vote on this issue!

This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

  • :+1: for "I would like to see this bug fixed as soon as possible"
  • :-1: for "There are more important bugs to focus on right now"

ywk253100 avatar Oct 14 '24 07:10 ywk253100

Is there any reason there isn't a third else for looping through drivers and picking one that works?

kaovilai avatar Oct 14 '24 18:10 kaovilai

Is there any reason there isn't a third else for looping through drivers and picking one that works?

Is this what you expect? This also makes sense to me.

  1. Use the default VolumeSnapshotClass annotated with snapshot.storage.kubernetes.io/is-default-class: true
  2. Else choose the VolumeSnapshotClass whose driver field matches the PVC's StorageClass
  3. Else use the VolumeSnapshotClass annotated with velero.io/csi-volumesnapshot-class: "true"

ywk253100 avatar Oct 15 '24 05:10 ywk253100

  1. Use the default VolumeSnapshotClass annotated with snapshot.storage.kubernetes.io/is-default-class: true
  2. Else choose the VolumeSnapshotClass whose driver field matches the PVC's StorageClass annotated with velero.io/csi-volumesnapshot-class: "true"
    1. since there can be multiple volumesnapshotclasses per driver, and you can add parameters in a different way for the same driver.
  3. Else choose the VolumeSnapshotClass whose driver field matches the PVC's StorageClass without annotation
  4. Else use the VolumeSnapshotClass annotated with velero.io/csi-volumesnapshot-class: "true" but driver does not match (not sure if this is even possible or not.. but last resort)

kaovilai avatar Oct 15 '24 18:10 kaovilai

https://github.com/vmware-tanzu/velero-plugin-for-csi/pull/178/files#diff-0f38f067df1a3a5e5fb78bd16bfeb63f7c7c89524abc32b98a875b6152474bb4

I believe snapshot.storage.kubernetes.io/is-default-class: true should come after all these 3 if

anshulahuja98 avatar Oct 17 '24 04:10 anshulahuja98

Because if we put this above the velero annotation, the existing behaviour will break in some sense for users.

anshulahuja98 avatar Oct 17 '24 04:10 anshulahuja98

If this is breaking change we can add feature flag to enable the new behavior.

kaovilai avatar Oct 17 '24 14:10 kaovilai

If this is breaking change we can add feature flag to enable the new behavior.

I don't think the feature flag is a good option because this will introduce another configuration.

We should try to avoid introducing the break change, so how about choosing the VolumeSnapshotClass as the following priority:

  1. The VolumeSnapshotClass annotated with velero.io/csi-volumesnapshot-class: "true"
  2. The VolumeSnapshotClass annotated with snapshot.storage.kubernetes.io/is-default-class: true
  3. The VolumeSnapshotClass whose driver field matches the PVC's StorageClass

And report error if the above logic matches more than 1 VolumeSnapshotClass

ywk253100 avatar Oct 18 '24 08:10 ywk253100

// If a snapshot class is sent for provider in PVC annotations, use that
snapshotClass, err := GetVolumeSnapshotClassFromPVCAnnotationsForDriver(pvc, provisioner, snapshotClasses)
if err != nil {
	log.Debugf("Didn't find VolumeSnapshotClass from PVC annotations: %v", err)
}
if snapshotClass != nil {
	return snapshotClass, nil
}

// If there is no annotation in PVC, attempt to fetch it from backup annotations
snapshotClass, err = GetVolumeSnapshotClassFromBackupAnnotationsForDriver(backup, provisioner, snapshotClasses)
if err != nil {
	log.Debugf("Didn't find VolumeSnapshotClass from Backup annotations: %v", err)
}
if snapshotClass != nil {
	return snapshotClass, nil
}

// fallback to default behaviour of fetching snapshot class based on label on VSClass 
// velero.io/csi-volumesnapshot-class: "true"
snapshotClass, err = GetVolumeSnapshotClassForStorageClass(provisioner, snapshotClasses)
if err != nil || snapshotClass == nil {
	return nil, errors.Wrap(err, "error getting volumesnapshotclass")
}

// fallback to default behaviour of fetching snapshot class based on label on VSClass 
// snapshot.storage.kubernetes.io/is-default-class: true
snapshotClass, err = GetVolumeSnapshotClassForStorageClass(provisioner, snapshotClasses)
if err != nil || snapshotClass == nil {
	return nil, errors.Wrap(err, "error getting volumesnapshotclass")
}

anshulahuja98 avatar Oct 18 '24 09:10 anshulahuja98

@kaovilai / @ywk253100 how does above draft look?

anshulahuja98 avatar Oct 18 '24 09:10 anshulahuja98

Looks good, missing the third one from https://github.com/vmware-tanzu/velero/issues/8294#issuecomment-2421855002

kaovilai avatar Oct 18 '24 12:10 kaovilai

I personally don't see a need for it at this point.

We should expect customer to put either of snapshot.storage.kubernetes.io/is-default-class: true snapshot.storage.kubernetes.io/is-default-class: true

This keeps the behaviour deterministic. Let me know if there is any user ask for this.

anshulahuja98 avatar Oct 21 '24 05:10 anshulahuja98

It just removes a step from pre-requisite that's all. Many local/dev cluster like KinD or crc would most likely only have one. This would keep velero install scripts generic for several local cluster environments. But that can also be done outside of velero so I am ok skipping non-deterministic for velero.

kaovilai avatar Oct 21 '24 16:10 kaovilai

IMHO, although this issue improves user experience, it may not be a high priority for v1.16.

reasonerjt avatar Oct 31 '24 08:10 reasonerjt

I agree with you @reasonerjt But this might be a good candidate for newcomers.

anshulahuja98 avatar Nov 08 '24 09:11 anshulahuja98

Hi! I am new to Velero and this issue looks like a good start. Can I get this assigned?

TheRealSibasishBehera avatar Dec 05 '24 19:12 TheRealSibasishBehera

I currently don’t have the time to work on this, so feel free to pick it up if anyone is available. Thanks

TheRealSibasishBehera avatar Feb 09 '25 23:02 TheRealSibasishBehera

There is an implementation going in #8646 that could be relevant to some affected by this issue.

kaovilai avatar Feb 10 '25 21:02 kaovilai

https://github.com/vmware-tanzu/velero-plugin-for-csi/pull/238

I encountered such volumeSnapshotClass in the project

Maybe label velero.io/csi-volumesnapshot-class not exist,but annotation snapshot.storage.kubernetes.io/is-default-class exist

Image

hu-keyu avatar Feb 24 '25 08:02 hu-keyu

please review @anshulahuja98 @ywk253100

hu-keyu avatar Feb 24 '25 08:02 hu-keyu

@hu-keyu , the CSI plugin repo is not longer under use. CSI plugin code is merged in core velero, would suggest you to raise PR in main repo - https://github.com/vmware-tanzu/velero

anshulahuja98 avatar Feb 24 '25 08:02 anshulahuja98

@hu-keyu , the CSI plugin repo is not longer under use. CSI plugin code is merged in core velero, would suggest you to raise PR in main repo - https://github.com/vmware-tanzu/velero

ok

hu-keyu avatar Feb 24 '25 08:02 hu-keyu

@anshulahuja98

https://github.com/vmware-tanzu/velero/pull/8719

hu-keyu avatar Feb 24 '25 09:02 hu-keyu