
InvalidParameterValueException: Uploaded file must be a non-empty zip

Open · t0yv0 opened this issue 1 year ago · 14 comments

What happened?

From @komalali

Hi! I have noticed a recurring error as I've worked on drift detection.

Context:

  • I have a stack that uses the aws-serverless template: stack
  • It is deployed in the dev-sandbox account, so every night a number of these resources are cleaned up
  • The following day, when I run drift detection and remediation, the first remediation fails. link
aws:lambda:Function (fn):
    error: 1 error occurred:
    	* creating Lambda Function (fn-a91d018): operation error Lambda: CreateFunction, https response error StatusCode: 400, RequestID: b2589290-ab14-4526-a65c-3b90fabeb5e2, InvalidParameterValueException: Uploaded file must be a non-empty zip

The following drift run/remediation succeeds.

Example

The source is very simple:

https://github.com/pulumi/deploy-demos/blob/main/pulumi-programs/drift-test/index.ts

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";
import * as apigateway from "@pulumi/aws-apigateway";

// A Lambda function to invoke
const fn = new aws.lambda.CallbackFunction("fn", {
    callback: async (ev, ctx) => {
        return {
            statusCode: 200,
            body: new Date().toISOString(),
        };
    }
})

// A REST API to route requests to HTML content and the Lambda function
const api = new apigateway.RestAPI("api", {
    routes: [
        { path: "/", localPath: "www"},
        { path: "/date", method: "GET", eventHandler: fn },
    ]
});

// The URL at which the REST API will be served.
export const url = api.url;

Output of pulumi about

$ pulumi about 
 Logging in using access token from PULUMI_ACCESS_TOKEN 
 CLI           
 Version      3.109.0 
 Go Version   go1.22.0 
 Go Compiler  gc 
  
 Plugins 
 NAME            VERSION 
 aws             6.25.1 
 aws-apigateway  2.4.0 
 awsx            2.5.0 
 docker          4.5.1 
 docker          3.6.1 
 nodejs          unknown 
  
 Host      
 OS       debian 
 Version  11.9 
 Arch     x86_64 
  
 This project is written in nodejs: executable='/usr/bin/node' version='v18.17.1' 
  
 Backend         
 Name           pulumi.com 
 URL            https://app.pulumi.com/komal-pulumi-corp 
 User           komal-pulumi-corp 
 Organizations  komal-pulumi-corp, service-provider-test-org, komal-testing-123, pulumi-test, pulumi 
 Token type     personal 
  
 Dependencies: 
 NAME                    VERSION 
 @pulumi/aws-apigateway  2.4.0 
 @pulumi/aws             6.25.1 
 @pulumi/awsx            2.5.0 
 @pulumi/pulumi          3.109.0 
 @types/node             18.19.24 
 typescript              4.9.5 
  
 Pulumi locates its logs in /tmp by default 
 warning: Failed to get information about the current stack: No current stack

Additional context

We tried a few times to reproduce locally but failed to do so, while it reproduces pretty consistently with Deployments, which is quite surprising. There's a bit going on with drift detection, but when all is said and done, Deployments is simply performing this operation over a state that tracks a lambda and a cloud where the lambda has been removed:

pulumi update --refresh --skip-preview --yes

Pulumi refreshes and drops the lambda from state, then decides to Create the lambda, which should succeed but does not.

Contributing

Vote on this issue by adding a 👍 reaction. To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

t0yv0 avatar Mar 13 '24 20:03 t0yv0

From @komalali: okay, I have some new information. I ran 2 separate deployments, pulumi refresh and pulumi update, instead of 1 remediate-drift, which does pulumi update --refresh. When I run them as 2 separate deployments, I don't get the error.

t0yv0 avatar Mar 13 '24 20:03 t0yv0

Using the Automation API for Go as well:

<redacted>-release-scheduler cnp1sl6p1qbs70d46850/main/479e594bcbaf4d3b9f3249417b21f8a0 stdout: Updating (dev.ap-northeast-1.firehose):
<redacted>-release-scheduler cnp1sl6p1qbs70d46850/main/479e594bcbaf4d3b9f3249417b21f8a0   pulumi:pulumi:Stack: (same)
<redacted>-release-scheduler cnp1sl6p1qbs70d46850/main/479e594bcbaf4d3b9f3249417b21f8a0     [urn=urn:pulumi:dev.ap-northeast-1.firehose::<redacted>::pulumi:pulumi:Stack::<redacted>-dev.ap-northeast-1.firehose]
<redacted>-release-scheduler cnp1sl6p1qbs70d46850/main/479e594bcbaf4d3b9f3249417b21f8a0     > pulumi:pulumi:StackReference: (read)
<redacted>-release-scheduler cnp1sl6p1qbs70d46850/main/479e594bcbaf4d3b9f3249417b21f8a0         [id=dev.us-west-2.global]
<redacted>-release-scheduler cnp1sl6p1qbs70d46850/main/479e594bcbaf4d3b9f3249417b21f8a0         [urn=urn:pulumi:dev.ap-northeast-1.firehose::<redacted>::pulumi:pulumi:StackReference::global]
<redacted>-release-scheduler cnp1sl6p1qbs70d46850/main/479e594bcbaf4d3b9f3249417b21f8a0         name: "dev.us-west-2.global"
<redacted>-release-scheduler cnp1sl6p1qbs70d46850/main/479e594bcbaf4d3b9f3249417b21f8a0     ~ aws:lambda/function:Function: (update)
<redacted>-release-scheduler cnp1sl6p1qbs70d46850/main/479e594bcbaf4d3b9f3249417b21f8a0         [id=dev-<redacted>-processor-365fe5e]
<redacted>-release-scheduler cnp1sl6p1qbs70d46850/main/479e594bcbaf4d3b9f3249417b21f8a0         [urn=urn:pulumi:dev.ap-northeast-1.firehose::<redacted>::aws:lambda/function:Function::dev-<redacted>-processor]
<redacted>-release-scheduler cnp1sl6p1qbs70d46850/main/479e594bcbaf4d3b9f3249417b21f8a0         [provider=urn:pulumi:dev.ap-northeast-1.firehose::<redacted>::pulumi:providers:aws::firehose-ap-northeast-1::ea66edb7-3c1c-4604-b1dc-42030393052b]
<redacted>-release-scheduler cnp1sl6p1qbs70d46850/main/479e594bcbaf4d3b9f3249417b21f8a0       - code: archive(assets:3e11bfc) {
<redacted>-release-scheduler cnp1sl6p1qbs70d46850/main/479e594bcbaf4d3b9f3249417b21f8a0         }
<redacted>-release-scheduler cnp1sl6p1qbs70d46850/main/479e594bcbaf4d3b9f3249417b21f8a0       + code: archive(file:10e3b68) { /app/downloads/lc-firehose-lambda.zip }
* updating urn:pulumi:dev.ap-northeast-1.firehose::<redacted>::aws:lambda/function:Function::<redacted>-processor: 1 error occurred:
* updating Lambda Function (dev-<redacted>-365fe5e) code: operation error Lambda: UpdateFunctionCode, https response error StatusCode: 400, RequestID: 6f2b719d-8766-4e14-b8d5-d2367a98dfa2, InvalidParameterValueException: Uploaded file must be a non-empty zip

It seems to work intermittently and I can't really determine what's causing it.

pulumi: 3.108.0
aws: 6.24.0

ryanpodonnell1 avatar Mar 13 '24 21:03 ryanpodonnell1

I opened this a couple of weeks back and it appeared resolved, but it is rearing its ugly head again: https://github.com/pulumi/pulumi-aws/issues/3478

ryanpodonnell1 avatar Mar 13 '24 21:03 ryanpodonnell1

I am also struggling with this error happening intermittently. I have not been able to replicate it yet. I am using a zip file that's generated with https://www.serverless.com/.

  lambda_forward_logs:
    type: aws:lambda:Function
    properties:
      code:
        fn::fileArchive: ./backend/.serverless/forward-logs.zip # generated by serverless
  aws:lambda:Function (lambda_forward_logs):
    error: 1 error occurred:
    	* updating urn:pulumi:staging::*********::aws:lambda/function:Function::lambda_forward_logs: 1 error occurred:
    	* updating Lambda Function (lambda_forward_logs-51c8ab4) code: operation error Lambda: UpdateFunctionCode, https response error StatusCode: 400, RequestID: 142db5f7-abd7-446a-b320-97202f102e20, InvalidParameterValueException: Uploaded file must be a non-empty zip

raymzag avatar Mar 14 '24 00:03 raymzag

We are also seeing this issue on creating and updating Lambda function code.

As a test, I created an aws.s3.BucketObjectv2 resource with the same asset, and the uploaded file is indeed empty (22 bytes).

The issue seems to go away if I restart my machine, but it does eventually come back.

fwang avatar Mar 18 '24 19:03 fwang

The zip files that get uploaded are stored in ${tmp}/pulumi-asset-${hash}, where the hash is based on the file contents of what should be in the zip, but sometimes these files end up being empty (22B file, running unzip -l ${file} outputs zipfile is empty).
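Assuming the cache filename really is a SHA-256 of the archive bytes (the 64-hex-character suffix in the listings below suggests SHA-256, but the exact hash input is an assumption, not confirmed from Pulumi's source), a small Node sketch can locate the cache entry for a given archive and flag a suspiciously small one:

```typescript
// Sketch only: assumes ${TMPDIR}/pulumi-asset-${HASH} uses a SHA-256 of the
// zip's bytes. assetCachePath and cacheLooksEmpty are hypothetical helpers,
// not part of Pulumi's API.
import { createHash } from "node:crypto";
import { readFileSync, statSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";

export function assetCachePath(zipFile: string): string {
    const hash = createHash("sha256").update(readFileSync(zipFile)).digest("hex");
    return join(tmpdir(), `pulumi-asset-${hash}`);
}

export function cacheLooksEmpty(zipFile: string): boolean {
    // An entry-less zip is exactly 22 bytes, so anything that small is suspect.
    try {
        return statSync(assetCachePath(zipFile)).size <= 22;
    } catch {
        return false; // no cached copy exists at all
    }
}
```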

julienp avatar Mar 18 '24 20:03 julienp

The AssetArchive to upload gets stored as a zip file in ${TEMP}/pulumi-asset-${HASH}, with ${HASH} based on the contents of the archive. When we run pulumi update --refresh and the zip files are not present yet, they get created in ${TEMP}, but empty; presumably the name is derived from the existing state. If we then run a pulumi update that needs to make changes, it fails with the empty-zip error. Since Deployments runs with an empty temp folder, it reproduces every time there, whereas locally you might still have the correctly formed zip from a previous update.
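The mechanism described above can be sketched as a toy model (illustrative only, not Pulumi's actual code; the cache, refresh, and update functions here are hypothetical): refresh recreates a cache entry under the content-derived name but with empty contents, and a later update trusts the cache hit instead of re-zipping.

```typescript
// Toy model of the failure mode: a content-hash-keyed cache where refresh
// writes an empty entry and update trusts any cache hit. Not Pulumi's code.
type Cache = Map<string, Buffer>;

// A zip with no entries: the 22-byte end-of-central-directory record.
const EMPTY_ZIP = Buffer.concat([Buffer.from("PK\x05\x06", "latin1"), Buffer.alloc(18)]);

// refresh: only the hash (from state) is known, not the real bytes, so the
// cache entry gets created empty.
function refresh(cache: Cache, stateHash: string): void {
    if (!cache.has(stateHash)) cache.set(stateHash, EMPTY_ZIP);
}

// update: a cache hit short-circuits rebuilding the archive, so the empty
// placeholder gets "uploaded".
function update(cache: Cache, hash: string, rebuild: () => Buffer): Buffer {
    const hit = cache.get(hash);
    if (hit !== undefined) return hit; // trusts the (possibly empty) cached zip
    const bytes = rebuild();
    cache.set(hash, bytes);
    return bytes;
}
```

In this model, skipping the refresh step means the first update rebuilds the archive correctly, matching the observed behavior that running refresh and update as separate deployments avoids the error.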

triage-empty-zip main% ls -ahl /var/folders/lv/czktcqg963q26_khyypypmkm0000gn/T/pulumi-asset-*
zsh: no matches found: /var/folders/lv/czktcqg963q26_khyypypmkm0000gn/T/pulumi-asset-*
triage-empty-zip main% pulumi update --refresh --stack dev
Previewing update (dev)

View in Browser (Ctrl+O): https://app.pulumi.com/v-julien-pulumi-corp/triage-pulumi-empty-zip/dev/previews/5600a2f8-7845-477b-9348-be5846e9f152

     Type                    Name                         Plan       Info
     pulumi:pulumi:Stack     triage-pulumi-empty-zip-dev
     ├─ aws:iam:Role         iam_for_lambda
 ~   └─ aws:lambda:Function  fn                           update     [diff: ~code]

Resources:
    ~ 1 to update
    2 unchanged

Do you want to perform this update? no
confirmation declined, not proceeding with the update
triage-empty-zip main% ls -ahl /var/folders/lv/czktcqg963q26_khyypypmkm0000gn/T/pulumi-asset-*
-rw-------  1 julien  staff    22B Mar 18 21:19 /var/folders/lv/czktcqg963q26_khyypypmkm0000gn/T/pulumi-asset-4271ac265fd6fbf363faa1a179cec1a6174963b9bc656f4fb989f3b502ecaf2a
-rw-------  1 julien  staff    22B Mar 18 21:19 /var/folders/lv/czktcqg963q26_khyypypmkm0000gn/T/pulumi-asset-92b616411729aeb8d8bbb0c1d5e5c9b305ebffa1e0563a2286eefc658220c23e

Those 22B files are empty zip files.
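A zip that contains no entries is exactly 22 bytes: just the end-of-central-directory record, starting with the signature PK\x05\x06. A quick Node check (a hypothetical helper, not part of the original report) can scan the temp directory for such files:

```typescript
// Sketch: find pulumi-asset-* cache files that are empty archives.
// isEmptyZip and findEmptyAssetZips are hypothetical helper names.
import { readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";

export function isEmptyZip(bytes: Buffer): boolean {
    // 0x06054b50 is the little-endian end-of-central-directory signature
    // ("PK\x05\x06"); an empty archive is that record alone, 22 bytes total.
    return bytes.length === 22 && bytes.readUInt32LE(0) === 0x06054b50;
}

export function findEmptyAssetZips(dir: string = tmpdir()): string[] {
    return readdirSync(dir)
        .filter((name) => name.startsWith("pulumi-asset-"))
        .map((name) => join(dir, name))
        .filter((file) => isEmptyZip(readFileSync(file)));
}
```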

If we run without --refresh, the zip gets created correctly:

triage-empty-zip main% rm /var/folders/lv/czktcqg963q26_khyypypmkm0000gn/T/pulumi-asset-*
triage-empty-zip main% ls -ahl /var/folders/lv/czktcqg963q26_khyypypmkm0000gn/T/pulumi-asset-*
zsh: no matches found: /var/folders/lv/czktcqg963q26_khyypypmkm0000gn/T/pulumi-asset-*
triage-empty-zip main% pulumi update --stack dev
Previewing update (dev)

View in Browser (Ctrl+O): https://app.pulumi.com/v-julien-pulumi-corp/triage-pulumi-empty-zip/dev/previews/0c1d95bc-27f1-4d90-907b-add516e73a52

     Type                    Name                         Plan       Info
     pulumi:pulumi:Stack     triage-pulumi-empty-zip-dev
 ~   └─ aws:lambda:Function  fn                           update     [diff: ~code]

Resources:
    ~ 1 to update
    2 unchanged

Do you want to perform this update? no
confirmation declined, not proceeding with the update
triage-empty-zip main% ls -ahl /var/folders/lv/czktcqg963q26_khyypypmkm0000gn/T/pulumi-asset-*
-rw-------  1 julien  staff   201B Mar 18 21:19 /var/folders/lv/czktcqg963q26_khyypypmkm0000gn/T/pulumi-asset-92b616411729aeb8d8bbb0c1d5e5c9b305ebffa1e0563a2286eefc658220c23e

Note the file size of 201B this time around.

julienp avatar Mar 18 '24 20:03 julienp

Yes!

Can confirm refresh was what caused it for us.

Can also confirm running refresh reliably reproduces the issue.

fwang avatar Mar 18 '24 20:03 fwang

Local reproduction:

  1. Deploy a lambda (note that the newly created ${TEMP}/pulumi-asset-${HASH} file is larger than 22B)
  2. Delete ${TEMP}/pulumi-asset-${HASH}
  3. Edit the code of the lambda
  4. pulumi up --refresh --stack dev (this will re-create the file for the previous hash, but as an empty zip file)
  5. Revert the code edit (code needs to be exactly as in step 1, so the hash matches)
  6. pulumi up --stack dev

The last step will attempt to upload the empty zip file.

julienp avatar Mar 18 '24 21:03 julienp

I haven't been able to repro on the original program from @komalali as it does not create ${TMPDIR}/pulumi-asset-${HASH} files but instead stores literal JS code in the Pulumi state file. So there might be several issues here.

After some experimentation I was able to reproduce based on this program:

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";
import * as apigateway from "@pulumi/aws-apigateway";

const cfg = new pulumi.Config();

const archive = cfg.require("archive");

const assumeRole = aws.iam.getPolicyDocument({
    statements: [{
        effect: "Allow",
        principals: [{
            type: "Service",
            identifiers: ["lambda.amazonaws.com"],
        }],
        actions: ["sts:AssumeRole"],
    }],
});

const iamForLambda = new aws.iam.Role("iam_for_lambda", {
    name: "iam_for_lambda",
    assumeRolePolicy: assumeRole.then(assumeRole => assumeRole.json),
});

const fn = new aws.lambda.Function("fn", {
    role: iamForLambda.arn,
    code: new pulumi.asset.FileArchive(archive),
    handler: "index.test",
    runtime: "nodejs18.x",
})

export const lambdaARN = fn.arn;

Calling refresh is not necessary for the repro. It looks like pulumi up clobbers the temp files for assets that are referenced from the state but not from the user program.

#!/usr/bin/env bash

set -euo pipefail

export AWS_PROFILE=devsandbox

# Clean slate.
pulumi destroy --yes
rm -rf $TMPDIR/*pulumi*asset*

# Provision a1.zip
pulumi config set archive ./a1/a1.zip
pulumi up --skip-preview --yes
pulumi stack export --file a1.json
md5 $TMPDIR/*pulumi*asset*

# Provision a2.zip
pulumi config set archive ./a2/a2.zip
pulumi up --skip-preview --yes
md5 $TMPDIR/*pulumi*asset*

# Create a discontinuity between cloud and local state.
pulumi stack import --file a1.json

# Lose the temp asset files.
rm $TMPDIR/*pulumi*asset*

# Refresh is not necessary.
# pulumi refresh --yes

# See if we can update again.
pulumi up --yes --skip-preview

# This clobbers the asset for a1.zip.
md5 $TMPDIR/*pulumi*asset*

pulumi config set archive ./a1/a1.zip
pulumi up --yes --skip-preview # FAILS HERE

t0yv0 avatar Mar 19 '24 18:03 t0yv0

| aws provider version | result     |
| -------------------- | ---------- |
| 5.43.0               | no repro   |
| 6.2.0                | no repro   |
| 6.5.0                | no repro   |
| 6.6.0                | no repro   |
| 6.6.1                | reproduces |
| 6.7.0                | reproduces |
| 6.10.0               | reproduces |

t0yv0 avatar Mar 19 '24 18:03 t0yv0

I have confirmed that https://github.com/pulumi/pulumi/pull/14007 caused the change in behavior by interacting in an unexpected way with the asset-handling code in the bridge. Building 6.6.1 of the provider against v3.90.1 of pulumi/pkg with 14007 reverted yields a pulumi-resource-aws provider where the problem no longer reproduces.

t0yv0 avatar Mar 19 '24 20:03 t0yv0

I have passed https://github.com/pulumi/pulumi/issues/15729 on to the core team; the evidence suggests that once that is fixed, we can remediate the issue here. Thanks for your patience, everyone.

t0yv0 avatar Mar 19 '24 20:03 t0yv0

Still awaiting a pulumi release (v3.111.1 does not contain the fix yet).

t0yv0 avatar Mar 25 '24 17:03 t0yv0

Looks like it was included in the latest release

ryanpodonnell1 avatar Mar 28 '24 14:03 ryanpodonnell1