velero
velero copied to clipboard
Need a way to skip backup of volumes based on type
Describe the problem/challenge you have When taking a backup, Velero tries to all pod volumes available. Some volumes can be too large or backed up using other solutions. It is desirable to skip backup of volumes based on its type.
Describe the solution you'd like A way to configure the backup to skip certain types of volumes like NFS.
Anything else you would like to add: None
Environment:
- Velero version (use
velero version
): 1.8.1 - Kubernetes version (use
kubectl version
): 1.22 - Kubernetes installer & version:
- Cloud provider or hardware configuration: vSphere
- OS (e.g. from
/etc/os-release
): Photon
Vote on this issue!
This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.
- :+1: for "The project would be better with this feature added"
- :-1: for "This feature will not enhance the project in a meaningful way"
@gsadhani Could you clarify why the opt-out approach can't solve the problem? Is it b/c you don't want the user to update his workload to skip the backup
@gsadhani Could you clarify why the opt-out approach can't solve the problem? Is it b/c you don't want the user to update his workload to skip the backup
@reasonerjt yes, identify each pod in the workload containing the specific type of volume and annotating them could be cumbersome. It would be very convenient if we could skip the volumes based on volume type which can be specified as a backup configuration parameter.
there are many factors could differentiate volume type : storage class, or bigger, or nfs, or specific drivers. is it support to add the label velero.io/exclude-from-backup=true to mark the pv as excluded?
We can consider K8S supported volume types only for this scenario https://kubernetes.io/docs/concepts/storage/volumes/. Also, initially we can start with allowed skip volume type list for limited volume types. Usually some of the volume types can be backed using storage systems like nfs, fc, iscsi OR ephemeral/non-persistent volumes like emptyDir. hostPath, downwardAPI, local etc.
This will reduce manual effort for annotating all pods. If there is autoscaling defined for pods, user need to find way to add annotation automatically.
found more SKIP backup requirements, we should consider all cases and comeup with a generic way to handle skip:
https://github.com/vmware-tanzu/velero/issues/957 skip restic backup/restore: not backup the PV data, but backup the cluster state. https://github.com/vmware-tanzu/velero/issues/4129 Exclude folders for backing up https://github.com/vmware-tanzu/velero/issues/2413 Namespace skip (this could be enhanced as separate enhancement)
In a scenario where DBS excludes the Pods with volume from backup, then Pods will get stuck in pending state during the recovery phase. What is the better approach to handle and avoid manual effort? DBS tries to reduce the risk of failure or delay in backup & restore that occurred due to volume related workloads. E.g. 80% workloads running without volume and 20% Pods running with volume. During restore this 20% becomes risk for 80% of the workload recovery. The current approach the account team suggested is to create empty PV manually. We exclude all of the PVs while taking velero backup and create empty PV manually for the 20% otherwise, opt-in to backup the PV selectively.
so in summary, the requirements of skip:
- option to skip all PV data
- option to skip specified PV type (inline, nfs)
- option to skip specified PV size
- option to skip folders ( I think this could be lower priority than others given we will evolve to snapshot/cbt based backup)
I understand the requirement that we may want to introduce a way for user to skip backup of certain PV but not having to update the spec of the pods.
However, I'm not sure if skipping backup PV based on type is a good idea, this may be handy for a certain user but is it generic enough that users have the similar usage pattern that for some type
of PV they have a separate backup process and velero does not need to handle them? Is it normal he has separate backup process for two of the nfs PVs but for other nfs PVs he still needs velero's help?
I think skipping PVs based on type is do-able but we should avoid adding features that are only feasible to a very small group of users.
it does not have to be a specific type defined here: https://kubernetes.io/docs/concepts/storage/volumes/
typical user story:
- as a platform operator, i know what backend storage i am using, so i want an easy way to skip the certain type storage being backup by velero.
Would it be better by using name of StorageClass? Platform operator should know well how many storage class they offer to applications. That way, we do not worry about the specific type of PV, and also give the flexility to admin.
A StorageClass provides a way for administrators to describe the "classes" of storage they offer. Different classes might map to quality-of-service levels, or to backup policies, or to arbitrary policies determined by the cluster administrators. Kubernetes itself is unopinionated about what classes represent. This concept is sometimes called "profiles" in other storage systems.
@euclidsun In our case, some PVCs are only used for logging while another PVCs are for DBs and configurations. I think It's difficult to only classify them by StorageClass
. Personally I think we may need a more flexible way from velero point of view, such as included/excluded arrays to have PVC/volume (in-tree) names or kind of "GVNR" style resource selector.
Then upper application can configure or calculate required volume lists to velero.
@blackpiglet Let's start the design in 1.10 timeframe
By far, I thought out of two scenarios of skipping volume backing up.
- Volumes are mounted multiple times in a backup. It should be handled only once. - I think this will reduce the work for opt-out volumes like NFS, when the same volume mounted many times.
- "backup.velero.io/backup-volumes" and "backup.velero.io/backup-volumes-excludes" should be handled on controller level, then pod created from controller scaling can also be handled.
I think filtering volume by type also has some value, but there may be not many suitable use cases.
This is useful when volume's dataSource is not PVC.
For example, if customer want to skip all emptyDir
type of volumes, this would work. For nfs
volumes, not sure whether customer using NFS volume by this way.
Filtering by StorageClass name should be useful for PVC case.
Another scenario is stated here https://github.com/vmware-tanzu/velero/issues/957#issuecomment-1231643187. Support only backing up one replica's volumes in a controller case.
By far, I thought out of two scenarios of skipping volume backing up.
- Volumes are mounted multiple times in a backup. It should be handled only once. - I think this will reduce the work for opt-out volumes like NFS, when the same volume mounted many times.
- "backup.velero.io/backup-volumes" and "backup.velero.io/backup-volumes-excludes" should be handled on controller level, then pod created from controller scaling can also be handled.
Another scenario is stated here #957 (comment). Support only backing up one replica's volumes in a controller case.
@gsadhani @pradeepkchaturvedi @reasonerjt What's your opinion on this proposals?
I think the higher priority is to design a mechanism and format of the filter in bakup CR so user can skip the PVs during backup, we will also need to decide if this filter applies to restic only or both restic and snapshot.
We should also consider the case when the filter conflicts with the opt-in/out
annotations.
A design proposal will be drafted in v1.10 timeframe.
@reasonerjt It was mentioned on yesterday's community call that we may want to also use the filter for deciding which PVs to back up with restic/kopia vs. snapshots.
@sseago Does that mean this filter is wanted to backup PVs by both uploader and snapshot?
@blackpiglet I'm not sure what design/UX is best here -- separate filters for "what back up via kopia/restic" vs. "what to back up via snapshot", or one filter for "include these volumes", and another (depending on opt-in vs. opt-out for kopia) for "among the volumes backed up, these use kopia". If we're not careful, this can get very confusing for users. But it was brought up on the call this week that some users wanted a way to identify restic/kopia vs. csi/snapshot volumes without having to individually annotate one category or the other.
@sseago Thanks for clarification. I will try to unify both in design.
There is some more requirements related to this topic from #5340.
close it for the implement pr is merged