`sam deploy` with deduped build checks for the presence of the identical build artifact many times
Describe your idea/feature/enhancement
I ran `sam deploy` with 17 lambdas that share the same CodeUri and Runtime (so deduped build is enabled).
I got the following output:
```
Uploading to my-stack-name/eb7b8e83[...snip...] 17287115 / 17287115 (100.00%)
File with same data already exists at my-stack-name/eb7b8e83[...snip...], skipping upload
[... another identical 15 lines ...]
```
This output implies that `sam deploy` unnecessarily uploads the identical artifact.
Proposal
Ignore identical uploads entirely.
Each skipped upload still takes a few seconds.
Ignoring identical uploads could improve the performance of `sam deploy`.
Additional Details
@yskkin Not sure I follow what you are proposing here. The output states "skipping upload" so not sure what is indicating unnecessary uploads.
https://github.com/aws/aws-sam-cli/blob/bfbcc166709dba330310088e5a5e520da1ef2444/samcli/lib/package/s3_uploader.py#L87-L89
This file_exists check takes a few seconds per lambda and slows down deployment once more than ~10 deduped lambdas are deployed.
Since we know the build artifacts are identical, I thought even the file_exists check could be skipped.
Currently facing the same issue - the SAM stack is taking sooooo long to get deployed, and I'm seeing a bunch of these messages being triggered.
Maybe these checks could be done in parallel or something similar?
So currently we check against the source (S3) to ensure the artifacts are indeed there. As it is today, we only know the file already exists by calling s3.head_object (which shouldn't take much time at all) (https://github.com/aws/aws-sam-cli/blob/bfbcc166709dba330310088e5a5e520da1ef2444/samcli/lib/package/s3_uploader.py#L196).
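For reference, a head_object-based existence check looks roughly like this minimal boto3 sketch (bucket and key are placeholders; this is not the actual SAM CLI code):

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

def file_exists(bucket: str, key: str) -> bool:
    """True if the object is already in the bucket.

    head_object is a cheap metadata-only call; it raises
    ClientError (404) when the key is absent.
    """
    try:
        s3.head_object(Bucket=bucket, Key=key)
        return True
    except ClientError:
        return False
```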
We could probably optimize this further somehow by caching some additional info.
> Maybe these checks could be done in parallel or something similar?
Parallel probably isn't the best idea, as you can run into race conditions in uploading and checking if the file exists.
Will update the labels. We can look at a PR if anyone is interested here, but this is not something currently on our roadmap.
If you are interested in submitting a fix, we will want to make sure that however this is cached, the result is reproducible and avoids skipping an upload that should have happened.
Thanks for looking into this guys. This is a real blocker for us, as it's taking nearly 50 minutes to deploy the stack at times... Any way this can be prioritized?
We'll have to switch over to CDK otherwise I'm afraid, as the release times are just too long.
Thanks again
@konarskis If this is blocking you, you are free to submit a PR with a patch.
I started looking into this issue because I was experiencing the same thing @konarskis was. It turns out it's not the head_object check that's taking so long – it's the fact that sam build puts each function into its own directory within .aws-sam/build, and then sam deploy zips up each of those directories in turn and then checks to see if the resulting file already exists in s3.
So for two functions, functionA and functionB, with the same CodeUri and Runtime, it has to:
1. Zip up `.aws-sam/build/functionA` and get the md5 of the result
2. Check to see if a file with that md5 has already been uploaded
3. It hasn't, so upload it
4. Zip up `.aws-sam/build/functionB` and get the md5 of the result
5. Check to see if a file with that md5 has already been uploaded
6. It has (because it's identical to `functionA`), so skip it
It's Step 4 that's the time-consuming one for each function, not Step 5.
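In code terms, the flow described above is roughly this (a simplified sketch that keys the object by the zip's md5; `package_function` is an illustrative name, not the actual packager implementation):

```python
import hashlib
import shutil

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

def package_function(build_dir: str, bucket: str) -> str:
    """Zip first, hash the zip, then check S3 - so the expensive zip
    (Step 4 above) runs for every function, even exact duplicates."""
    archive = shutil.make_archive(build_dir, "zip", build_dir)
    with open(archive, "rb") as f:
        key = hashlib.md5(f.read()).hexdigest()
    try:
        s3.head_object(Bucket=bucket, Key=key)   # duplicate: only the upload is skipped
    except ClientError:
        s3.upload_file(archive, bucket, key)
    return key
```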
The solution would seem to be a more optimized build process where the target directory is .aws-sam/[artifact_hash], where artifact_hash is the md5 hash of the function's CodeUri + Runtime. That would allow some deduping of the build as well as the upload. But it's a pretty major change to how things work, and I don't know the codebase well enough to predict what kind of knock-on effects it might produce.
@jfuss, if you think this approach is worth pursuing, I'd be willing to take a run at it, but I didn't want to forge ahead without some idea that such a refactor would be welcome/considered.
I guess a less invasive version of that would be to create a dict that maps each function's resource ID to a hash of its CodeUri + Runtime and use that to decide which ones to upload. Maybe I'll give that a go and submit a PR.
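For illustration, that dict-based dedup might look something like this rough sketch (`artifact_hash` and `plan_uploads` are hypothetical names, not SAM CLI internals):

```python
import hashlib

def artifact_hash(code_uri: str, runtime: str) -> str:
    """Hash the (CodeUri, Runtime) pair that drives build dedup."""
    return hashlib.md5(f"{code_uri}:{runtime}".encode()).hexdigest()

def plan_uploads(functions: dict[str, dict]) -> dict[str, str]:
    """Map each logical resource ID to its artifact hash; a hash seen
    earlier means the zip/upload for that function can be skipped."""
    seen: set[str] = set()
    plan: dict[str, str] = {}
    for resource_id, props in functions.items():
        h = artifact_hash(props["CodeUri"], props["Runtime"])
        plan[resource_id] = h
        if h in seen:
            print(f"{resource_id}: same artifact as an earlier function, skipping")
        else:
            seen.add(h)
    return plan
```

Called with two functions sharing the same CodeUri and Runtime, the second would be flagged as a duplicate before any zipping happens.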
Hey all, just checking in to see if there's been any progress or workarounds for this. As our code grows, we've been experiencing 25min waits for these checks and deployment. Thank you very much!
I started working on the idea I posted about two comments up and never got anywhere with it. (I can't remember now if it was a technical issue or just me having to reprioritize.)
This dominates our deploy time (30 minutes+ at present), would be a big help to improve this time.
Looking at the code more, I think it actually works like:
1. Zip up `.aws-sam/build/functionA` and get the md5 of the result
2. Check to see if a file with that md5 has already been uploaded
3. It hasn't, so upload it
4. ~~Zip up `.aws-sam/build/functionB` and get the md5 of the result~~
   4a. Get the MD5 of the unzipped directory
   4b. Zip up the directory
5. Check to see if a file with that md5 has already been uploaded
6. It has (because it's identical to `functionA`), so skip it
Also, I think the MD5 of the actual content is important, as otherwise it would never upload changes to the function in the first place (i.e., keying on just CodeUri and Runtime = no changes uploaded, ever).
If the zip is what's taking the time, it should be possible to calculate the hash and check S3 before zipping the directory. (If the hash is what's taking the time, there are faster hashes.)
(I would make this change but I am just an unfrozen caveman, your unit testing and pull requests frighten and confuse me.)
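A minimal sketch of that hash-before-zip idea, assuming hypothetical `dir_checksum`/`upload_if_missing` helpers and keying the S3 object by the directory digest (the real packager names objects differently):

```python
import hashlib
import os
import shutil

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

def dir_checksum(path: str) -> str:
    """MD5 over each file's relative path and bytes, walked in a
    deterministic order, so the digest changes whenever the code does."""
    digest = hashlib.md5()
    for root, dirs, files in os.walk(path):
        dirs.sort()                      # deterministic traversal
        for name in sorted(files):
            full = os.path.join(root, name)
            digest.update(os.path.relpath(full, path).encode())
            with open(full, "rb") as f:
                digest.update(f.read())
    return digest.hexdigest()

def upload_if_missing(build_dir: str, bucket: str) -> str:
    """Hash first, check S3 second, and zip/upload only on a miss."""
    key = dir_checksum(build_dir)
    try:
        s3.head_object(Bucket=bucket, Key=key)   # hit: skip the zip entirely
    except ClientError:
        archive = shutil.make_archive(build_dir, "zip", build_dir)
        s3.upload_file(archive, bucket, key)
    return key
```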
There is a hidden feature flag / environment variable SAM_CLI_BETA_BUILD_PERFORMANCE. It makes sam build create only one directory for all lambdas in the stack, which tremendously improves its runtime.
We can add a similar SAM_CLI_BETA_PACKAGE_PERFORMANCE flag that will check whether the local CodeUri path was already uploaded earlier in the same command.
With both flags enabled, sam build will create only one folder, and then sam package will upload the folder for the first lambda resource, and then skip it for the rest since the folder was already uploaded.
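A rough sketch of what that per-command cache could gate (`upload_code_uri` and `do_upload` are illustrative names only, not SAM CLI internals):

```python
import os
from typing import Callable

# Per-command memo: local CodeUri path -> S3 URL of its uploaded artifact.
_uploaded: dict[str, str] = {}

def upload_code_uri(local_path: str, do_upload: Callable[[str], str]) -> str:
    """do_upload is whatever zips + uploads the path today; this wrapper
    just short-circuits repeats of the same local path within one run."""
    if os.environ.get("SAM_CLI_BETA_PACKAGE_PERFORMANCE") and local_path in _uploaded:
        return _uploaded[local_path]
    url = do_upload(local_path)
    _uploaded[local_path] = url
    return url
```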
> Maybe these checks could be done in parallel or something similar?
Just to add on to the other comments - this is likely not going to help, since the heavy operation (zip) is also disk-intensive, which already "runs in parallel".
SAM_CLI_BETA_BUILD_PERFORMANCE has been great. I'd love to see the PACKAGE_PERFORMANCE feature as well, since checking the same already-uploaded resource a bunch of times is currently the longest part of my deploy process by far.
Just curious - does sam sync respect the BUILD_PERFORMANCE flag when it builds? Or does the AutoDependencyLayer stuff make that difficult?