vendir icon indicating copy to clipboard operation
vendir copied to clipboard

Feature Request: Support Flag to Disable Deletion

Open osterman opened this issue 5 years ago • 13 comments

use-case

We're using vendir in a number of different ways to vendor configuration files in from a centralized source.

  1. One way is to "layer" configurations from a centralized source. That is, we import a base of configurations and then add a few minor customizations in new files.
  2. The other way is we initialize configurations (then disable and diverge), while maintaining a record of when we divereged in the vendir.yml.

The problem is when we disable a configuration, then vendir sync wants to delete all the files. And if we add files into a vendir'd configuration, and then run vendir sync, it will delete those extraneous files. The current behavior makes sense, however, we're hoping we could introduce something that will address the use-cases above. We'll help implement it as well.

Request A

Support an enabled setting. In the example below, enabled is set to false, so the configuration is skipped. Running vendir sync will not attempt to modify or delete the local copy (it's as though the stanza were deleted from the file). This is useful for when we deliberately want to diverge, but want to keep a record in vendir.yml the point at which we diverged.

apiVersion: vendir.k14s.io/v1alpha1
kind: Config

directories:

  # Terraform components
  - path: components/terraform
    contents:

      - path: account-map
        # Disable this from running
        enabled: false
        git:
          url: https://github.com/cloudposse/terraform-aws-components
          ref: 0.137.0
        newRootPath: modules/account-map
        includePaths:
          - modules/account-map/**/*

      - path: account-settings
        git:
          url: https://github.com/cloudposse/terraform-aws-components
          ref: 0.137.0
        newRootPath: modules/account-settings
        includePaths:
          - modules/account-settings/**/*

Request B

Support an ignorePaths block which specifies paths that should not be deleted (or synchronized), even if they do not exist at the source. This is useful when we layer in configurations. For example, we add additional configuration files that do not exist at the source.

In the example below, the ignorePaths block says to ignore the fact that backend.tf.json does not exist at the source and also, ignore any files in the catalog/* folder.


apiVersion: vendir.k14s.io/v1alpha1
kind: Config

directories:

  # Terraform components
  - path: components/terraform
    contents:

      - path: account-map
        # Ignore files which are generated locally and should not be overridden or deleted.
        ignorePaths: &ignore
        - backend.tf.json
        - catalog/*
        git:
          url: https://github.com/cloudposse/terraform-aws-components
          ref: 0.137.0
        newRootPath: modules/account-map
        includePaths:
          - modules/account-map/**/*

      - path: account-settings
        # Use a YAML anchor to ignore common files
        <<*ignore
        git:
          url: https://github.com/cloudposse/terraform-aws-components
          ref: 0.137.0
        newRootPath: modules/account-settings
        includePaths:
          - modules/account-settings/**/*

Request C

Support a --skip-delete flag. on vendir sync. This is useful during development or as a precaution to avoid accidentally deleting any files.

Running vendir sync --skip-delete will run as usual, but not remove any files (but possibly overwrite).

osterman avatar Dec 01 '20 19:12 osterman

@aknysh on our team is standing by ready to contribute the changes if you agree to the additions.

osterman avatar Dec 01 '20 19:12 osterman

Hi @osterman!

Thanks for the feature requests! Before we make a decision on whether these features are something to implement, could you provide us with a few more details on the use cases you've described? Even a concrete example would be great! We would just like to get a sense for exactly what the goal is in each use case and how accomplishing that goal currently looks.

Eli

ewrenn8 avatar Dec 02 '20 20:12 ewrenn8

@ewrenn8 - sure thing! Here's our precise use-case.

We're using vendir as our tool for vendoring in terraform "root" modules (aka top-level modules), as well as vendoring in helmfiles for helmfile.

We maintain literally hundreds of terraform modules (more than any other company out there!), and dozens of helmfiles and these are leveraged in our forth-coming reference architecture. We rely on vendoring as a way to easily compose an infrastructure from our infrastructure library and various catalogs of configurations.

We've struggled for a long time with how to handle the vendoring. For example, Terraform supports some primitive vendoring with terraform init --from-module=... but we still have the question for how to manage helmfiles.

Vendir is the answer.

Except, well, for example when vendoring in root-modules in terraform that we call "components", we need a backend configuration. That backend configuration differs depending on the remote state storage (e.g. s3, tfc, vault, consul, etc). So the "backend.tf" must be generated as interpolation is not supported. We do this by generating a backend.tf.json file from a YAML structure we've defined. We use another tool called variant2 to generate workflows from that YAML definition, as part of our atmos utility.

The problem is using vendir with something like the vendir.yaml (example) will delete any of these *.tf.json files we generate in the vendir'd directory. True, we can re-generate them, but it's annoying. Also, there are other files (e.g. *.auto.tfvars) that can be overridden. We'd like to exclude those so it doesn't happen.

Hope that helps!

osterman avatar Dec 04 '20 08:12 osterman

@osterman For clarification, are the generated files generated from the assets that are synced via vendir? If so, how will you handle the potential that the assets which were used to generate the backend were changed during a sync?

ewrenn8 avatar Dec 08 '20 21:12 ewrenn8

@ewrenn8

If so, how will you handle the potential that the assets which were used to generate the backend were changed during a sync?

For our use-case, the backend.tf.json does not need to be regenerated based on the sync. Additionally, the file would never occur upstream.

But to solve this in the general sense, it could easily be handled by ignorePaths: (as in the example above)

    ignorePaths:
    - backend.tf.json
    - extra-config-file.yaml
    - conf.d/*

Vendir should ignorePaths here. Not sync it. Not delete it.

osterman avatar Dec 08 '20 23:12 osterman

Vendir should ignore anything here. Not sync it. Not delete it.

what would happen when upstream deleted a directory that happens to currently contain *.tf.json?

but to solve this in the general sense, it could easily be handled by ignore: (as in the example above)

the way i think about vendir is that it layers sources i.e. fetch source, then apply include/exclude paths. for your proposed use case i could imagine allowing additional layers from "existing" directory (e.g. copy over all files matching glob from existing directory to a directory that is being staged).

cppforlife avatar Dec 10 '20 19:12 cppforlife

I've updated my description to use terminology more consistent with vendir.

Current Directives

These features are all supported by vendir today:

  • includePaths specify what should be included. Any files outside of includePaths are deleted.
  • excludePaths are "placed" on top of include paths, and any files inside of excludePaths are deleted.
  • legalPaths are paths to files that need to be included for legal reasons such as LICENSE file

Proposed Directive

This is what we need:

  • ignorePaths are neither included nor excluded from remotes. Any files inside of ignorePaths are left alone and not deleted.

osterman avatar Dec 15 '20 07:12 osterman

looks like ignorePaths attribute is what we want.

aknysh avatar Dec 15 '20 18:12 aknysh

and to be truthful to the example description, it would be this right?

        ignorePaths: &ignore
        - **/backend.tf.json
        - catalog/**

to cover all nested backend.tf.json. what should happen if upstream removed a directory and we have backend.tf.json in our local copy? im assuming we would still copy it over (and in process ensure that directory is available).

(implementation for this would be to download everything from scratch as usual, but also copy over all paths that match given globs on top of new directories).

cppforlife avatar Dec 16 '20 01:12 cppforlife

@cppforlife yes, thanks for the correction in the example. That's more accurate.

Also, I was thinking about some analogs for comparison to other tools:

  • the functionality we're talking about is similar to the behavior of the .gitignore file. These files are not considered part of the git tree, just like ignorePaths should not be considered as part of the vendoring.
  • it is also a bit similar to rsync --exclude which will not copy the files from the upstream, but also won't delete the files from local unless the flag --delete-excluded is passed.

As for the implementation, @aknysh any thoughts on that?

osterman avatar Dec 16 '20 18:12 osterman

@ewrenn8 anything more we can help provide to make our case for this feature? Again, we're offering to implement it. =)

osterman avatar Jan 06 '21 00:01 osterman

hey @osterman and @aknysh. back from holidays... ive created this issue to capture what i believe we came from our discussion above: https://github.com/vmware-tanzu/carvel-vendir/issues/37. if that description and details make sense, i think it's unblocked to be implemented.

cppforlife avatar Jan 06 '21 18:01 cppforlife

hey @cppforlife did #37 fix this issue also? Or just was a pre-cursor to doing it? Looks like it's been quite a long time since an offer was made for implementation. So just checking in here

neil-hickey avatar Feb 22 '23 22:02 neil-hickey