terraform-provider-archive

Support glob paths in archive_file data source excludes

Open mbamber opened this issue 4 years ago • 8 comments

It would be really handy if data.archive_file supported glob matching for excludes. This would enable configurations such as:

data "archive_file" "zip" {
  type        = "zip"
  output_path = "./myzip.zip"
  source_dir  = "."
  excludes    = ["**/.git"]
}

in order to exclude all .git directories from the archive.

The glob matching should ideally support the well-known ** syntax, meaning zero or more directories.
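To make the requested semantics concrete, here is a minimal Python sketch of doublestar-style matching. It is not taken from the provider; the helper names `glob_to_regex` and `is_excluded` are hypothetical, and the rule that a path is excluded when any of its parent directories matches is an assumption about how directory excludes would behave.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob with ** support into a compiled regex.

    Assumed semantics (illustrative, not the provider's):
      **/  matches zero or more whole directories
      **   matches anything, including '/'
      *    matches any run of characters except '/'
      ?    matches a single character except '/'
    """
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(r"(?:[^/]+/)*")
            i += 3
        elif pattern.startswith("**", i):
            out.append(r".*")
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^/]*")
            i += 1
        elif pattern[i] == "?":
            out.append(r"[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def is_excluded(path: str, patterns: list[str]) -> bool:
    """True if the path, or any of its parent directories, matches a pattern."""
    parts = path.split("/")
    prefixes = ["/".join(parts[:n]) for n in range(1, len(parts) + 1)]
    regexes = [glob_to_regex(p) for p in patterns]
    return any(r.match(prefix) for r in regexes for prefix in prefixes)
```

With this sketch, `is_excluded("vendor/lib/.git/HEAD", ["**/.git"])` is true because the parent directory `vendor/lib/.git` matches, while `src/.gitignore` is kept.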

Affected Resource(s)

  • data.archive_file

Feb 28 '20 08:02 mbamber

Just to add some context, glob paths would be perfect for ignoring things like node_modules. In my case, I am using this provider to create zip files for App Engine deployments and ideally don't want to include the local version of node_modules when deploying. Manually specifying each file to exclude isn't really feasible in this case 😞

Jan 16 '21 16:01 tombailey

This would be really useful. My current use case is packaging some Python code for a Lambda. Archiving the code is the only step that is left out, and only because I cannot exclude directories when creating the package from my repository root directory.

Feb 16 '21 11:02 giannimassi

Any updates on this?

We definitely need something like this too. We have exactly the same problem: we do not want to include the local node_modules, among other things. We've thought about a workaround of creating a temporary directory, placing all the needed files in it, and then zipping that. But glob path support would greatly simplify the process.
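For anyone who wants to try the temporary-directory workaround in the meantime, a rough Python sketch follows. The function name `zip_without` is made up for illustration, and it assumes that matching on file or directory names at any depth (what `shutil.ignore_patterns` does) is enough for your case.

```python
import shutil
import tempfile
from pathlib import Path


def zip_without(source_dir, output_zip, ignored_names):
    """Copy source_dir to a staging directory, skipping ignored names, then zip it.

    ignored_names are fnmatch-style patterns applied to each file or
    directory name at any depth, e.g. ["node_modules", ".git", "*.pyc"].
    """
    output_zip = Path(output_zip)
    with tempfile.TemporaryDirectory() as tmp:
        staging = Path(tmp) / "staging"
        # copytree skips every entry whose name matches one of the patterns.
        shutil.copytree(
            source_dir, staging, ignore=shutil.ignore_patterns(*ignored_names)
        )
        # make_archive appends ".zip" itself, so pass the path without suffix.
        archive = shutil.make_archive(
            str(output_zip.with_suffix("")), "zip", root_dir=staging
        )
    return Path(archive)
```

Something like `zip_without(".", "./myzip.zip", ["node_modules", ".git"])` could then run before Terraform, or from a local-exec step, at the cost of copying the whole tree first.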

Dec 02 '21 08:12 panoc1

As a workaround, I've been using:

 = setunion(
    ["package.json", "package-lock.json","README.md"],
    fileset("${path.module}/../../../handlers/example", "node_modules/**")
  )

It does make smaller archives, but creating the archive takes much longer because the exclude array becomes massive.

Having glob/doublestar built in would allow the best of both worlds: faster archiving and smaller archives.

Mar 10 '22 06:03 willfarrell

I don't mean to necro-bump this issue, but I landed here trying to solve similar problems. I ended up implementing a script which can be used via the external data source which will archive a file and supports globbing patterns. I'm sure it's not perfect, but it's working for me and I thought it may help others who end up here from Google.

archive.py

Example of usage:

data "external" "archive" {
  program = ["python3", "${path.module}/external/archive.py"]
  
  query = {
    output_path = "./myarchive.zip"
    source_path = "${path.module}/some/important/directory/"
    
    # Exclusions match **file** names not directories
    exclusions = jsonencode([
      # These match relative to the source path
      ".git/**",
      "env/**",
      "node_modules/**",
      # This could be anywhere underneath the source path
      "**/__pycache__/**",
      "**/*.pyc",
    ])
  }
}
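For readers who cannot open the linked script, below is a rough sketch of what a program for the external data source might look like. It is not the archive.py linked above: the function names and the `file_count` result key are hypothetical. It relies on the external data source protocol, in which the query arrives as a JSON object on stdin and the program prints a JSON object of string values on stdout. Matching uses `fnmatch`, whose `*` also matches `/`, so this is looser than true doublestar semantics.

```python
import fnmatch
import json
import os
import sys
import zipfile


def matches(rel_path, pattern):
    """Glob match on a '/'-separated relative path (looser than real doublestar)."""
    if fnmatch.fnmatchcase(rel_path, pattern):
        return True
    # Let a leading "**/" also match zero directories.
    return pattern.startswith("**/") and fnmatch.fnmatchcase(rel_path, pattern[3:])


def build_archive(source_path, output_path, exclusions):
    """Zip every file under source_path that matches none of the exclusions."""
    included = []
    out_abs = os.path.abspath(output_path)
    with zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(source_path):
            dirs.sort()  # deterministic traversal order
            for name in sorted(files):
                full = os.path.join(root, name)
                if os.path.abspath(full) == out_abs:
                    continue  # never add the archive to itself
                rel = os.path.relpath(full, source_path).replace(os.sep, "/")
                if any(matches(rel, p) for p in exclusions):
                    continue
                zf.write(full, rel)
                included.append(rel)
    return included


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read the query from stdin and print a result object of string values."""
    query = json.load(stdin)
    # The exclusions arrive as a JSON-encoded string, as in the usage above.
    exclusions = json.loads(query.get("exclusions", "[]"))
    included = build_archive(query["source_path"], query["output_path"], exclusions)
    result = {"output_path": query["output_path"], "file_count": str(len(included))}
    json.dump(result, stdout)
```

To use it as the `program`, the file would end with a call to `main()`. Note that the external data source runs on every plan, so the archive is rebuilt each time.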

Aug 29 '22 04:08 calebstewart

This feature seems like it's a no-brainer to add.

Apr 25 '23 21:04 vyacheslav31

As a temp solution, I implemented a custom filtered_archive module that gives this functionality. Usage:

module "layer_archive" {
  source     = "[email protected]:asaf-kali/resources//tf/filtered_archive"
  source_dir = "my_layer/"
  name       = "layer"
}

resource "aws_lambda_layer_version" "dependencies_layer" {
  layer_name       = "my-layer"
  filename         = module.layer_archive.output_path
  source_code_hash = filebase64sha256(module.layer_archive.output_path)
  depends_on       = [
    module.layer_archive,
  ]
}

It also supports an exclude_patterns variable (a list of strings).

May 12 '23 20:05 asaf-kali

Ain't no way this is still open. 💀

Feb 01 '24 01:02 vantaboard