velero icon indicating copy to clipboard operation
velero copied to clipboard

Support cluster resource Includes/Excludes

Open 27149chen opened this issue 2 years ago • 21 comments

Describe the problem/challenge you have

The current filter (IncludedResources/ExcludedResources + IncludeClusterResources flag) is not enough for some special cases, such as:

  1. all namespaced resources + some kind of cluster resource
  2. all namespaced resources + cluster resource excludes
  3. no namespaced resources + all cluster resources

Describe the solution you'd like

Add new field IncludedClusterResources and ExcludedClusterResources, examples: (wildcard rules: includeResources: empty for all, excludeResources/includeClusterResources: "*" for all, excludeResources/includeClusterResources/excludeClusterResources: empty for nothing)

  1. no namespaced resources + no cluster resources:
excludeResources: ["*"]
  1. no namespaced resources + some cluster resources:
excludeResources: ["*"]
includeClusterResources: ["persistentvolumes"]
excludeResources: ["*"]
excludeClusterResources: ["persistentvolumes"]
  1. no namespaced resources + all cluster resources
excludeResources: ["*"]
includeClusterResources: ["*"]
  1. some namespaced resources + no cluster resources:
includeResources: ["configmaps"]
excludeResources: ["configmaps"]
  1. some namespaced resources + some cluster resources:
includeResources: ["configmaps"]
includeClusterResources: ["persistentvolumes"]
excludeResources: ["configmaps"]
includeClusterResources: ["persistentvolumes"]
includeResources: ["configmaps"]
excludeClusterResources: ["persistentvolumes"]
excludeResources: ["configmaps"]
excludeClusterResources: ["persistentvolumes"]
  1. some namespaced resources + all cluster resources:
includeResources: ["configmaps"]
includeClusterResources: ["*"]
excludeResources: ["configmaps"]
includeClusterResources: ["*"]
  1. all namespaced resources + no cluster resources
includeResources:
  1. all namespaced resources + some cluster resources:
includeClusterResources: ["persistentvolumes"]
excludeClusterResources: ["persistentvolumes"]
  1. all namespaced resources + all cluster resources:
includeClusterResources: ["*"]

Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]

Environment:

  • Velero version (use velero version):
  • Kubernetes version (use kubectl version):
  • Kubernetes installer & version:
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):

Vote on this issue!

This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

  • :+1: for "The project would be better with this feature added"
  • :-1: for "This feature will not enhance the project in a meaningful way"

27149chen avatar Jul 14 '22 09:07 27149chen

Agreed that current combination of include/exclude filters of namespace resources and cluster resources causes confusion, IMO, ideally we should clarify that current include/exclude resource filters will only apply to namespace resources and we'll introduce new include/exclude cluster-resource filters for filtering cluster resources

reasonerjt avatar Jul 18 '22 03:07 reasonerjt

Actually, I think the current excludes/excludes do apply to cluster-scoped resources. If you include PersistentVolumes in excludedResources, then I don't think any will be backed up. If you have an includedResources list with both cluster-scoped and namespaced resources, then both sets will be included (but nothing else). The issue here is that once you have an includes list, then nothing outside the list is included. So you can't say "include all namespaced resources but only include X, Y, and Z cluster-scoped resources."

It's worth noting that if lncludeClusterResources is nil (rather than true or false) then Velero will pull in certain resources as-needed. In particular, if your main use case for cluster-scoped resources is persistentVolumes, then you probably don't want to include all PVs, just the ones associated with PVCs you're backing up, so what you want is to set includeClusterResources to nil (or leave the field out of spec entirely, which is equivalent).

sseago avatar Jul 18 '22 20:07 sseago

so what you want is to set includeClusterResources to nil (or leave the field out of spec entirely, which is equivalent).

@sseago The problem is that I do want to set includeClusterResources to true since I need some cluster resources. And the current filter (IncludedResources/ExcludedResources + IncludeClusterResources flag) can not meet most of the cases I mentioned above

27149chen avatar Jul 19 '22 03:07 27149chen

@27149chen I agree with @sseago. --include-resources and --exclude-resources are not limited to namespace resource.

Usually, Velero backup will not separate resources into namespace and cluster scope. Taking backup resources in namespace test as an example, velero backup create backup01 --include-namespaces=test this command will backup all namespace resources in namespace test, and related cluster scope resources are included too.

What is the background of the special use cases you mentioned?

  • all namespaced resources + some kind of cluster resource
  • all namespaced resources + cluster resource excludes
  • no namespaced resources + all cluster resources

blackpiglet avatar Jul 19 '22 06:07 blackpiglet

related cluster scope resources are included too

it does not meet our requirement, we want to use includeClusterResources flag (according to you words, this is a "bad" flag which I should never use it, otherwise, some unexpected behavior will occur. If so, what's the point of its existence?)

27149chen avatar Jul 19 '22 10:07 27149chen

What is the background of the special use cases you mentioned?

most of them are valid requirements from our customer

27149chen avatar Jul 19 '22 10:07 27149chen

@27149chen IMO, there are ways to do the similar effect for some of the cases.

  • all namespaced resources + some kind of cluster resource: may not have an easy way to achieve this by current parameters. Need to use --excluded-resources parameter to list all not included resources.
  • all namespaced resources + cluster resource excludes: backup all things, except the specified cluster resources. --excluded-resources=<exclude-cluster-resources-list>.
  • no namespaced resources + all cluster resources: create an empty namespace, and use the two parameters in backup command --include-namespaces=<empty-namespace-name> --include-cluster-resources=true.

By my understanding, the missing piece is a parameter to specify resources to include, and it will not exclude all the other things as the current parameter --included-resources.

blackpiglet avatar Jul 19 '22 12:07 blackpiglet

So setting includeClusterResources to nil does include some cluster resources, but it's limited to those resources that are directly relevant to other resources being backed up (this handles the usual needs for PVs, CRDs, etc.) If your use case is "I want all of cluster resources X, Y, and Z but none of A, B, and C" then you can handle that right now by just setting includeClusterResources to true and adding A, B, and C to excludedResources.

As @blackpiglet said, the only thing you can't do right now directly is "only include X, Y, and Z cluster resources" unless you're willing/able to list everything not A, B, and C in excludeResources. Will the "list all cluster resources you don't want in the exclude list" meet your needs?

sseago avatar Jul 19 '22 15:07 sseago

As for "no namespaced resources but all cluster resources", I wonder whether we can handle this by just excluding all namespaces while seting includeClusterResources to true. I know that wildcards don't currently work on exclude, and the issue is an empty include list is equivalent to "*" -- what's less clear to me is whether it would work to have includedNamespaces=["foo"] and also have excludedNamespaces=["foo"] -- this might mean that the full list of namespaces is ["foo"] - ["foo"] == [], but I haven't tested that scenario myself.

sseago avatar Jul 19 '22 15:07 sseago

@sseago @blackpiglet , are we talking about workarounds or a formal solution?

27149chen avatar Jul 20 '22 02:07 27149chen

So setting includeClusterResources to nil does include some cluster resources, but it's limited to those resources that are directly relevant to other resources being backed up (this handles the usual needs for PVs, CRDs, etc.) If your use case is "I want all of cluster resources X, Y, and Z but none of A, B, and C" then you can handle that right now by just setting includeClusterResources to true and adding A, B, and C to excludedResources.

As @blackpiglet said, the only thing you can't do right now directly is "only include X, Y, and Z cluster resources" unless you're willing/able to list everything not A, B, and C in excludeResources. Will the "list all cluster resources you don't want in the exclude list" meet your needs?

@sseago sorry but I think you do not understand my problem, if includeClusterResources is set to true, all pvs will be backed up, not just the relevant ones, see details in https://github.com/vmware-tanzu/velero/issues/5119

27149chen avatar Jul 20 '22 02:07 27149chen

namespaces is ["foo"] - ["foo"] == [], but I haven't tested that scenario myself.

@sseago please test it first before adding this comment, AFAIK, it is a invalid config if an item is both in includes and excludes, and an error will be raised

27149chen avatar Jul 20 '22 02:07 27149chen

@27149chen We all see your cases' value, and try to find out a way reasonable for all of us. Just my opinion, IncludedResources/ExcludedResources don't separate resource by two groups depending on whether they are limited in namespace, so there is no need to add two new parameters focusing on cluster-scope things. I also see there is gap for your need. all namespaced resources + some kind of cluster resource: may not have an easy way to achieve this by current parameters. Need to use --excluded-resources parameter to list all not included resources. If either adding a new flag that can specify including a list of resources without excluding all the others or change the behavior of IncludedResources parameter, but don't add two parameters as you proposed, does that work for you?

blackpiglet avatar Jul 20 '22 07:07 blackpiglet

@blackpiglet I'm happy if you can find a way to resolve this problem without adding two parameters for cluster resources.

I have listed 16 cases in my issue description, and provided a way that I think is simple and effective. If you can find another way to do it clean and simple, please give your examples.

Thanks.

27149chen avatar Jul 20 '22 12:07 27149chen

So one thing to consider -- if we were to implement your proposal, we'd need a different name. Your proposal doesn't account for the nil value of includeClusterResources -- i.e. only include relevant resources, vs. none vs. all. I think we would still need that even if we added some notion of additional filtering/including. For example, your proposal loses the abiliy to "only include PVs that are needed" So the include list would need to have a name other than the existing bool pointer "includeClusterResources" which has 3 valid values now -- true, false, and nil.

Also, I've seen other requests to allow for wildcards on the exclude list. If implemented, this would allow for "no namespaced resources" by simply specifying "*" for excludedNamespaces. That would handle cases 1-3.

Case 4 is already handled by setting includeClusterResources to false

For case 5, examples 1 and 4 already work today by setting includeClusterResources to true and putting both namespaced and cluster resources in the include or exclude list. 2 and 3 are trickier:

excludeResources: ["configmaps"]
includeClusterResources: ["persistentvolumes"]
includeResources: ["configmaps"]
excludeClusterResources: ["persistentvolumes"]

The only way to do the above right now would be to reformat this into a list of all excludes or all includes. But how central is this use case? I imagine someone else might say "I want per-namespace include/exclude lists" -- but once we get to that level of granularity of control, the recommendation would be to create two backups. I wonder whether the same would work here -- a cluster-scoped backup and a namespaced backup.

6 and 8 are similar. The second example can be done today by just using the existing include/exclude lists. The first would need 2 backups.

7 and 9 are supported by the existing options.

So out of the 16 total examples, it looks like 8 are supported today. 4 more would be supported by allowing for wildcards in excludeNamespaces, and 4 could be done today by using a partitioned backup scheme, but to do in a single backup would require two additional config params (but without removing/changing the existing true/false/nil includeClusterResources parameter).

The concern here is that the cluster/namespaced partition in the config doesn't really feel natural to me. If we were going to redesign the selection mechanism to be more flexible than it is today, I'm not certain that this is the only use case we'd care about adding. I'd want a more flexible config that might be less confusing than some combination of include/excluderesources, includeClusterResources true/false/nil, and two new cluster-scoped include/exclude parameters.

sseago avatar Jul 20 '22 15:07 sseago

@sseago please give you resource filter spec for every case as I did in the description so that we can discuss one by one and find the gap

27149chen avatar Jul 23 '22 13:07 27149chen

I'd like to add the case "all resources in a namespace" + "some selected cluster resources", which seems to not be possible ATM. Well one can run two separate backups, ok, but ideally it should be possible in the same backup.

waldner avatar Aug 02 '22 18:08 waldner

Also, if I can suggest, a filter based on resource names (not labels) would be most welcome, at least for my use cases.

waldner avatar Aug 03 '22 11:08 waldner

Also, if I can suggest, a filter based on resource names (not labels) would be most welcome, at least for my use cases.

@waldner you are talking about https://github.com/vmware-tanzu/velero/issues/5118 , which is also important to me

27149chen avatar Aug 03 '22 16:08 27149chen

Thanks! I subscribed to that one.

waldner avatar Aug 03 '22 17:08 waldner

After discussion, I think it's reasonable to create a new filtering parameter to focus on namespace resources, because currently there are three ways to filtering resources:

  • Label based filter: OrLabelSelectors and LabelSelector.
  • Resource type based filter: IncludedResources, ExcludedResources and IncludeClusterResources.
  • Namespace based filter: IncludedNamespaces and ExcludedNamespaces. Label based filter currently only applied to namespaced resources, so it can be regards as namespace based filter.

There are senarios that resource type based filter and namespace based filter interfere mutually. Seperate --include-resources and --exclude-resources into cluster scope and namespace scope parameters can resolve these issues.

blackpiglet avatar Sep 13 '22 07:09 blackpiglet