terraform-provider-archive
data.archive_file does not generate archive file during apply
Hi there,
It looks like data.archive_file does not generate the archive file during apply.
Terraform Version
Terraform version: 0.11.11
- provider.archive v1.1.0
Affected Resource(s)
- archive_file
Terraform Configuration Files
data "archive_file" "deployment_package" {
  type        = "zip"
  source_dir  = "../../example/"
  output_path = ".${replace(path.module, path.root, "")}/tmp/example.zip"
}
Expected Behavior
Archive file is generated during terraform apply.
Actual Behavior
Archive file is not generated. However, if I run terraform plan before apply, the archive is generated.
Steps to Reproduce
- terraform apply
Based on your like on https://github.com/terraform-providers/terraform-provider-archive/issues/3 I assume this is for the case where plan is executed and outputting a plan which is then applied from a clean environment.
We've also experienced this in a CI environment where plan and apply are separate stages and I can also simulate the issue with this code:
data "archive_file" "this" {
  type        = "zip"
  output_path = "test.zip"
  source_file = "a.txt"
}

resource "aws_s3_bucket_object" "this" {
  bucket = "YOURBUCKETHERE"
  key    = "test.zip"
  source = "test.zip"
}
and then running something along the lines of
terraform plan -out=tfplan
rm test.zip
terraform apply "tfplan"
Same issue in my case
still same issue.
Same issue here.
Seeing the same thing on 0.12.17: when I change a file in the directory referenced below, terraform plan doesn't pick up the change unless I taint aws_s3_bucket_object.cookbook.
data "archive_file" "cookbook" {
  output_path = "${path.module}/temp/cookbook.zip"
  source_dir  = "${path.module}/cookbook-archive"
  type        = "zip"
}

resource "aws_s3_bucket_object" "cookbook" {
  bucket = module.cookbook.bucket_name
  key    = "cookbook.zip"
  source = data.archive_file.cookbook.output_path
  tags = {
    ManagedBy = "Terraform"
  }
}
I'm running plan via app.terraform.io, so I assume it would generate the archive on every run and not cache it from a previous run.
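For anyone hitting the "plan doesn't pick up the change" part: one commonly used approach is to pass the archive's hash to the S3 object, so that a content change shows up as a diff even though the path stays the same. This is only a sketch based on the config above, not something confirmed in this thread. It assumes the bucket does not use KMS encryption, since the S3 ETag is not a plain MD5 in that case.

```hcl
data "archive_file" "cookbook" {
  output_path = "${path.module}/temp/cookbook.zip"
  source_dir  = "${path.module}/cookbook-archive"
  type        = "zip"
}

resource "aws_s3_bucket_object" "cookbook" {
  bucket = module.cookbook.bucket_name
  key    = "cookbook.zip"
  source = data.archive_file.cookbook.output_path

  # output_md5 changes whenever the archive content changes, which lets
  # plan detect the update without tainting the resource.
  etag = data.archive_file.cookbook.output_md5
}
```

Note that this only helps with change detection. It does not make the zip exist during an apply that runs in a clean environment.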
I'm adding an additional workaround below.
If you don't know which files will change, I suggest something along the following:
data "external" "hash" {
  program = ["bash", "${path.module}/scripts/shasum.sh", "${path.module}/configs", "${timestamp()}"]
}

data "archive_file" "main" {
  type        = "zip"
  output_path = pathexpand("archive-${data.external.hash.result.shasum}.zip")
  source_dir  = pathexpand("${path.module}/configs")
}

output "archive_file_path" {
  value = data.archive_file.main.output_path
}
where ${path.module}/configs is the folder to archive. We pass timestamp() to the first data source so that the hash is recomputed on every run.
The content of the shasum.sh script is as follows (note that this works only on UNIX-based systems, so it won't work on Windows):
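#!/bin/bash
# Usage: shasum.sh FOLDER [EXTRA...]; extra arguments (e.g. the timestamp) are ignored.
# Strip a trailing slash from the folder path.
FOLDER_PATH="${1%/}"
# Hash every top-level file, then hash the combined listing into one digest.
SHASUM=$(shasum "$FOLDER_PATH"/* | shasum | awk '{print $1}')
# The external data source expects a JSON object on stdout.
echo -n "{\"shasum\":\"${SHASUM}\"}"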
yea, a while after I posted my last comment, I came up with something like
locals {
  source_dir = "${path.module}/cookbook-archive"
}

resource "random_uuid" "this" {
  keepers = {
    for filename in fileset(local.source_dir, "**/*") :
    filename => filemd5("${local.source_dir}/${filename}")
  }
}

data "archive_file" "cookbook" {
  # threw the `/temp/` in there to gitignore it easier, but in hindsight it
  # could be just as easy to gitignore `cookbook*.zip`
  output_path = "${path.module}/temp/cookbook-${random_uuid.this.result}.zip"
  source_dir  = local.source_dir
  type        = "zip"
}

resource "aws_s3_bucket_object" "cookbook" {
  bucket = module.cookbook.bucket_name
  key    = "cookbook.zip"
  source = data.archive_file.cookbook.output_path
  tags = {
    ManagedBy = "Terraform"
  }
}
(did this from memory, so it might not quite work as-is, but it should be close)
This is much better, thanks! Maybe update your code so that it's valid (it needs a ',' on line 3, and ${filename}" on line 4).
good catch, thanks! also dried it up a bit :)
Oops, just run into a weird thing with this code (seems like a provider error):
Error: Provider produced inconsistent final plan
When expanding the plan for module.slo-pipeline-cf-errors.random_uuid.hash to
include new values learned so far during apply, provider "random" produced an
invalid new value for .keepers["slo_config.json"]: was
cty.StringVal("d8073f7f8a404661c31a3cdf66ae6f8d"), but now
cty.StringVal("b42b077fe6dd6e3a57af845c5b0c6c0d").
This is a bug in the provider, which should be reported in the provider's own
issue tracker.
weird. I haven't run into that, but I've also only made one change, so maybe it'll bite me next time. Maybe try one of the other file hash methods? could be something weird about md5 on one of the systems involved?
Ah, it's because I'm dynamically adding a file (generated by TF) to my source directory, using the local_file resource. Even with a depends_on = [local_file.main] in the random_uuid.this resource, it seems like the fileset is executed before the file is dropped in the folder, thus confusing Terraform.
what if you added it explicitly somehow? something like:
resource "random_uuid" "this" {
  keepers = {
    localfile => md5(local_file.main.content)
    for filename in fileset(local.source_dir, "**/*"):
    filename => filemd5("${local.source_dir}/${filename}")
  }
}
no idea if for loops work like that... :)
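A for expression can't be mixed with a literal key inside one pair of braces like that, but merge() can combine the two maps. Below is an untested sketch of that idea. The file name generated.json and the placeholder content are hypothetical. The generated file is hashed from its configured content and skipped in the fileset loop, so the keepers should not depend on whether the file is already on disk. Whether this fully avoids the "inconsistent final plan" error above is an assumption, not something verified here.

```hcl
locals {
  source_dir     = "${path.module}/cookbook-archive"
  generated_name = "generated.json" # hypothetical file name
}

resource "local_file" "main" {
  filename = "${local.source_dir}/${local.generated_name}"
  content  = jsonencode({ example = true }) # placeholder content
}

resource "random_uuid" "this" {
  keepers = merge(
    # Hash the generated file from its configured content, not from disk.
    { (local.generated_name) = md5(local_file.main.content) },
    # Hash every other file from disk, skipping the generated one.
    {
      for filename in fileset(local.source_dir, "**") :
      filename => filemd5("${local.source_dir}/${filename}")
      if filename != local.generated_name
    }
  )
}
```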
The tricks work indeed, but then, each time a new apply is made, the archive and all resources that depend on it (e.g. a lambda function) will be modified, even if the content of the lambda did not change. The Terraform code is then not idempotent anymore.
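If the goal is an archive name that changes only when the content changes, one option is to derive the name from a hash of the source files instead of from timestamp(). This is a minimal sketch under that assumption, reusing the cookbook-archive directory from the earlier comments. It needs a Terraform version that has fileset() (0.12.8 or later), and it does not address the separate problem of the zip being missing when apply runs in a clean environment.

```hcl
locals {
  source_dir = "${path.module}/cookbook-archive"

  # Combine each relative path with its SHA-256 and hash the result.
  # Sets of strings are iterated in lexical order, so the value is stable
  # across runs as long as the files themselves do not change.
  source_hash = sha256(join("", [
    for f in fileset(local.source_dir, "**") :
    "${f}:${filesha256("${local.source_dir}/${f}")}"
  ]))
}

data "archive_file" "cookbook" {
  type        = "zip"
  source_dir  = local.source_dir
  output_path = "${path.module}/temp/cookbook-${local.source_hash}.zip"
}
```

Because the hash is computed from files on disk at plan time, two runs over unchanged files produce the same output_path, so dependent resources should show no diff.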
I ran into this on Terraform Cloud, also. It would be ideal if we could persist a single directory between the plan and apply phases (or, if archive_file was smart enough to regenerate the archive during "apply" if it was missing).
If I create the initial zip file manually myself then the archive_file behavior on subsequent apply runs works fine for me -- using terraform version 0.12.28
Based on your like on #3 I assume this is for the case where plan is executed and outputting a plan which is then applied from a clean environment.
We've also experienced this in a CI environment where plan and apply are separate stages and I can also simulate the issue with this code:
data "archive_file" "this" {
  type        = "zip"
  output_path = "test.zip"
  source_file = "a.txt"
}

resource "aws_s3_bucket_object" "this" {
  bucket = "YOURBUCKETHERE"
  key    = "test.zip"
  source = "test.zip"
}
and then running something along the lines of
terraform plan -out=tfplan
rm test.zip
terraform apply "tfplan"
This comment helped me solve my issue. I'm using terraform in a Gitlab CI pipeline with separate plan and apply stages. My apply stage would fail because the archive file was not found.
What's happening (and the comment above helped me understand) is the plan step is where the archive file is actually created. To make this work in my CI pipeline, I added config to cache the files created by the plan stage and make them available to the apply stage.
I'd recommend changing the archive provider to produce the zip file during apply instead of plan. This would match with how I think about Terraform working. At a minimum, the docs for the archive provider should be updated to make it clear when Terraform creates the archive file.
how the hell did they manage to mess up a goddamn zip command
This solution worked for me: adding a source code hash. https://github.com/hashicorp/terraform/issues/8344#issuecomment-265548941
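For readers who don't follow the link, "adding a source code hash" commonly looks something like the sketch below for a Lambda function. This is an assumption about the general pattern, not a copy of the linked comment. The function name, handler, runtime and the lambda_role_arn variable are all hypothetical placeholders.

```hcl
variable "lambda_role_arn" {
  type = string # hypothetical: ARN of an existing execution role
}

data "archive_file" "function" {
  type        = "zip"
  source_dir  = "${path.module}/lambda/function"
  output_path = "${path.module}/lambda_output/function.zip"
}

resource "aws_lambda_function" "example" {
  function_name = "example"
  role          = var.lambda_role_arn
  handler       = "index.handler"
  runtime       = "python3.12"
  filename      = data.archive_file.function.output_path

  # Redeploy only when the zip content changes, not on every apply.
  source_code_hash = data.archive_file.function.output_base64sha256
}
```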
Having the exact same issue in our Gitlab CI pipeline. We couldn't use artifacts since we have many zips and it might just upload sensitive data to Gitlab. As a workaround, we are obliged to rerun terraform plan in the apply step just to create the zip file.
EDIT: According to this bit of documentation, you can defer the creation of the archive file until some resource is applied (ie. in the terraform apply step). One can imagine something like this, which also works as a workaround:
data "archive_file" "zip" {
  type        = "zip"
  source_file = "${path.module}/textfile.txt"
  output_path = "${path.module}/myfile.zip"
  depends_on = [
    random_string.r
  ]
}

resource "random_string" "r" {
  length  = 16
  special = false
}
or something like this, which has an equivalent dependency graph:
data "archive_file" "zip" {
  type        = "zip"
  source_file = "${path.module}/textfile.txt"
  output_path = "${path.module}/myfile-${random_string.r.result}.zip"
}

resource "random_string" "r" {
  length  = 16
  special = false
}
I just ran into this issue in Gitlab as well
I managed to tweak @amine250 's solution to get it working. The random string does not work as it will already determine it in the plan phase it seems. Hence, I used a null resource that is triggered by a timestamp as mentioned here.
The downside of this approach is that even when the underlying files haven't changed, it will trigger and update. In my case this works out nicely as I'm using this to deploy a Cloud Function (GCP), which will not redeploy when there are no changes (the zipfile I upload to Cloud Storage has a hash in its name).
Note that using the null_resource directly on the archive data source and triggering the null_resource with a hash of the two file contents does not work.
# Dummy resource to ensure archive is created at apply stage
resource "null_resource" "dummy_trigger" {
  triggers = {
    timestamp = timestamp()
  }
}

data "local_file" "py_main" {
  filename = "${path.root}/../../../../cloud_function/main.py"
  depends_on = [
    # Make sure archive is created in apply stage
    null_resource.dummy_trigger
  ]
}

data "local_file" "py_req" {
  filename = "${path.root}/../../../../cloud_function/requirements.txt"
  depends_on = [
    # Make sure archive is created in apply stage
    null_resource.dummy_trigger
  ]
}

data "archive_file" "cf_zip" {
  type        = "zip"
  output_path = "${path.root}/../../../../tmp/cf.zip"
  source {
    content  = data.local_file.py_main.content
    filename = "main.py"
  }
  source {
    content  = data.local_file.py_req.content
    filename = "requirements.txt"
  }
}
I also ran into the same issue in Gitlab. The random_string resource did not work for me, but the null_resource approach did. Thanks!
Is there a follow up on this? I was here one year ago, this behaviour still occurs
Wanted to leave a warning for anyone considering the suggestion:
If I create the initial zip file manually myself then the archive_file behavior on subsequent apply runs works fine for me -- using terraform version 0.12.28
I tested this out and it does not work. It simply unbreaks the apply by putting an old version of the zip file there.
test.tf:
data "archive_file" "api" {
  type        = "zip"
  source_dir  = "${path.module}/test_files/"
  output_path = "${path.module}/test.zip"
  excludes    = ["__pycache__"]
}

resource "local_file" "zip_sha" {
  content  = data.archive_file.api.output_sha
  filename = "${path.module}/test_sha.txt"
}
Taking an old copy of the zip file with sha , and running the following shows that we end up with the old version of the zip file present during apply.
cp old_test.zip test.zip
terraform plan -out=tfplan
cp old_test.zip test.zip
terraform apply "tfplan"
cat test_sha.txt
> 33585fa47331712f37d9206c3587b6a1380db53b
shasum test.zip
> 0dd4eb3e0f51b5f659c991d1ff93ef5d2c1cc2a0 test.zip
Knowing that the archive is created during plan helped solve my pipeline problem, where I also plan and then apply in separate gitlab pipeline stages. The apply would attempt to upload the lambda zip files, which were generated in the plan stage, and it would fail. Adding the zip folder to the artifacts of the plan stage fixed it, and the apply stage now works.
I don't know why the planning stage is being used to generate zip files. Planning should just be about making the plan file; applying should be about creating things and performing actions. It seems wrong to do it in the plan stage, as other people have commented.
Got hit by that problem, and I also solved it using https://github.com/hashicorp/terraform-provider-archive/issues/39#issuecomment-815021702
Works well (but it would be nice if the Terraform doc contained more borderline examples like this... )
https://github.com/hashicorp/terraform-provider-archive/issues/39#issuecomment-815021702
also did the trick for me
The archive_file artifacts are produced during the plan stage. You just need to pass the artifacts across the stages.
For instance, for Gitlab CI:
image:
  name: hashicorp/terraform:1.1.9
  entrypoint:
    - '/usr/bin/env'
    - 'PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'

variables:
  PLAN: "plan.tfplan"
  TF_IN_AUTOMATION: "true"

.terraform_before_script:
  - terraform --version
  # Ensure directory for lambda function zip files exists
  - install -d lambda_output
  - terraform init -input=false

stages:
  - plan
  - deploy

plan:
  stage: plan
  before_script: !reference [.terraform_before_script]
  script:
    - terraform plan -out=$PLAN -input=false
  artifacts:
    name: plan
    paths:
      - $PLAN
      - lambda_output

deploy:
  stage: deploy
  before_script: !reference [.terraform_before_script]
  script:
    - terraform apply -input=false $PLAN
  dependencies:
    - plan
Then, in your Terraform file:
data "archive_file" "function" {
  type        = "zip"
  source_dir  = "${path.root}/lambda/function"
  output_path = "${path.root}/lambda_output/function.zip"
}
FYI, it's not recommended to store plan files as artifacts, because they might contain sensitive data and are not encrypted.