Bug: SAM Deploy freezes with no output
Description:
When running sam deploy (Lambda) for one specific deployment, I get a small amount of text output and then nothing. I can leave the deploy running for a few hours and nothing happens. When I check for a new or updated stack, there is nothing present.
Steps to reproduce:
sam deploy --debug --template-file /home/octopus/Work/20250217122150-5136753-2697/xdm-xxx-xxx/package.yml --stack-name codedeploylambda-development-vdm-xdm-xxx-xxx --capabilities CAPABILITY_IAM --tags businessline=xxxx-device-manager vzc:cicd:octopus:projectname=xdm-xxx-xxx version=0.0.5-cicd.5 vzc:cicd:octopus:releasedate=2025-02-17t12:22:10.000z name=xdm-xxx-xxx vzc:component:name=vdm cft=xx vzc:component:owner=xx vzc:architecture:vastid=26064 product=platform vzc:cicd:octopus:releaseversion=0.0.5-cicd.5 octopusproject=xdm-xxx-xxx environment=development application=vdm vzc:architecture:product=platform vzc:cicd:octopus:deployedby=bamboo --no-progressbar --region eu-west-1 --parameter-overrides 'Parameters="{\\\"env_name\\\":\\\"development\\\",\\\"region\\\":\\\"eu-west-1\\\",\\\"env\\\":\\\"development\\\",\\\"account_number\\\":\\\"0000000000000\\\"}" Layer2="arn:aws:lambda:eu-west-1:00000000000:layer:Powershell_7_2_13_LambdaRuntime:1" KinesisBisectBatchOnError="false" KinesisMaxRetryAttempt="5" TestRoleArn="" EventSourceParallelizationFactor="2" FunctionName="development-vdm-xdm-xxx-xxx" KinesisEventBatchSize="5000" SubnetIds="subnet-0000000,subnet-00000000,subnet-000000000" SecurityGroupIds="sg-000000000" Timeout="900" FunctionRoleArn="arn:aws:iam::000000000000:role/security-lambda-xxxxxx-platform-development" KinesisEventMaximumBatchingWindow="2" Layer1="arn:aws:lambda:eu-west-1:00000000:layer:xdm-xxx-xxx:2" Handler="handler.ps1::handler" FunctionNameShort="development-vdm-xdm-ingre" DeploymentType="Linear10PercentEvery1Minute" Runtime="provided.al2"'
Observed result:
2025-02-17 12:22:29,301 | Using SAM Template at /home/octopus/Work/20250217122150-5136753-2697/xdm-ingress-lambda/package.yml
2025-02-17 12:22:29,302 | No config file found in this directory.
2025-02-17 12:22:29,302 | OSError occurred while reading TOML file: [Errno 2] No such file or directory: '/home/octopus/Work/20250217122150-5136753-2697/xdm-ingress-lambda/samconfig.toml'
2025-02-17 12:22:29,302 | Config file location: /home/octopus/Work/20250217122150-5136753-2697/xdm-ingress-lambda/samconfig.toml
2025-02-17 12:22:29,302 | Config file '/home/octopus/Work/20250217122150-5136753-2697/xdm-ingress-lambda/samconfig.toml' does not exist
<no more output>
Expected result:
I have other deploys with the debug flag that deploy as expected. They start with similar output, yet they don't freeze:
SAM CLI update available (1.133.0); (1.132.0 installed)
To download: https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-install.html
2025-02-17 12:56:59,204 | Using SAM Template at /home/octopus/Work/20250217125630-5136857-2919/reveal.externaltoken.api/package.yml
2025-02-17 12:56:59,205 | No config file found in this directory.
2025-02-17 12:56:59,205 | OSError occurred while reading TOML file: [Errno 2] No such file or directory: '/home/octopus/Work/20250217125630-5136857-2919/reveal.externaltoken.api/samconfig.toml'
2025-02-17 12:56:59,205 | Config file location: /home/octopus/Work/20250217125630-5136857-2919/reveal.externaltoken.api/samconfig.toml
2025-02-17 12:56:59,205 | Config file '/home/octopus/Work/20250217125630-5136857-2919/reveal.externaltoken.api/samconfig.toml' does not exist
2025-02-17 12:56:59,243 | OSError occurred while reading TOML file: [Errno 2] No such file or directory: '/home/octopus/Work/20250217125630-5136857-2919/reveal.externaltoken.api/samconfig.toml'
2025-02-17 12:56:59,267 | Using config file: samconfig.toml, config environment: default
2025-02-17 12:56:59,267 | Expand command line arguments to:
<snip>
Additional environment details (Ex: Windows, Mac, Amazon Linux etc)
- OS: Amazon Linux
- sam --version: 1.132.0
- AWS region: eu-west-1
Hi @vaujo6y. Sorry for the delay. Can you provide more information on that "specific deployment"?
- What is different between where it works and where it doesn't?
- Is it the same template but with different parameters/region/tags/values? Or just different templates? Did you try to reproduce in a different region for example?
- What types of resources are you creating in your template? (Lambda functions only? Other infrastructure?)
- Does anything get created in CloudFormation? Maybe a ChangeSet if this is an existing stack at least? (Or an empty stack if it's a brand new stack)
- Is this in your own development environment, or is it part of a pipeline or CI/CD workflow?
We're not aware of a widespread issue related to sam deploy, so we need to understand more about your particular situation. As a first step, you can update to the latest version of SAM CLI (1.134.0), which has some dependency upgrades that may help with your issue.
- No difference that we know of. We thought that the first deploy to an environment would always work and the subsequent ones would always fail, but in tests just now we cleaned out the old stack, deployed the same release, and it "froze". So we don't see a pattern.
- It's always the same template, but with varying results on whether it deploys or not.
- Lambda (PowerShell on the custom provided.al2 runtime)
- When we get a successful deployment, we can see the stack and the resources. It feels like a flip of a coin as to whether it deploys or not.
- CI/CD Workflow.
I will try 1.134.0 and also try deploying outside of our CI/CD workflows to see if that makes any difference.
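For testing outside the pipeline, one option is to bound each attempt with a timeout so a frozen run ends after minutes instead of hours and whatever output was produced is kept. The following is only a rough sketch: `run_with_timeout` is a hypothetical helper name, and the timeout value and the arguments in the usage comment are placeholders, not settings taken from our pipeline.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd and kill it if it runs longer than timeout_s seconds.

    Returns a dict with the exit code (None on timeout), a timed_out flag,
    and the combined stdout/stderr captured up to that point.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so no log line is lost
            timeout=timeout_s,
        )
        return {
            "returncode": proc.returncode,
            "timed_out": False,
            "output": proc.stdout.decode(errors="replace"),
        }
    except subprocess.TimeoutExpired as exc:
        # On timeout the child has been killed; exc.stdout holds any
        # partial output (bytes) or None if nothing was captured.
        partial = exc.stdout or b""
        return {
            "returncode": None,
            "timed_out": True,
            "output": partial.decode(errors="replace"),
        }


# Hypothetical usage (argument list shortened, 600 s is an arbitrary limit):
# result = run_with_timeout(
#     ["sam", "deploy", "--debug", "--template-file", "package.yml",
#      "--stack-name", "codedeploylambda-development-vdm-xdm-xxx-xxx",
#      "--region", "eu-west-1"],
#     timeout_s=600,
# )
# print(result["timed_out"], result["output"][-2000:])
```

Running this several times in a row against the same release should show how often the freeze occurs and what the last log line is each time.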
An extra clarification on 4.
When it freezes, do you see anything happening with the stack in CloudFormation? Are there any "Events" visible in the stack? (This is easier to see in the CloudFormation console.) Does CFN even see that a new deployment is supposed to happen?
It sounds to me like it could be a problem at the CloudFormation level, so it would be useful to know whether the deployment actually reaches CloudFormation at all, or whether the freeze happens before getting to CloudFormation (in the latter case, it would be more likely to be an issue with SAM CLI).
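If the console is awkward to reach from the CI/CD agent, the same check can be scripted. This is a minimal sketch, assuming boto3 is installed and credentials are available on the agent; `cloudformation_activity` is a hypothetical helper name, and it relies only on the boto3 CloudFormation client calls `describe_stacks`, `list_change_sets`, and `describe_stack_events`.

```python
def cloudformation_activity(cfn, stack_name, max_events=10):
    """Summarise what CloudFormation currently knows about stack_name.

    cfn is a boto3 CloudFormation client (or any object exposing the same
    three methods). Returns a dict describing the stack, its change sets,
    and its most recent events.
    """
    try:
        stacks = cfn.describe_stacks(StackName=stack_name)["Stacks"]
    except Exception as exc:
        # boto3 raises ClientError with a "does not exist" message when
        # the stack is missing; anything else is re-raised untouched.
        if "does not exist" in str(exc):
            return {"exists": False, "status": None,
                    "change_sets": [], "events": []}
        raise

    change_sets = cfn.list_change_sets(StackName=stack_name)["Summaries"]
    # Events are returned most recent first, so slicing keeps the newest.
    events = cfn.describe_stack_events(StackName=stack_name)["StackEvents"]
    return {
        "exists": True,
        "status": stacks[0]["StackStatus"],
        "change_sets": [(c["ChangeSetName"], c["Status"]) for c in change_sets],
        "events": [
            (e["Timestamp"], e["LogicalResourceId"], e["ResourceStatus"])
            for e in events[:max_events]
        ],
    }


# Hypothetical usage while a deploy appears frozen:
# import boto3
# cfn = boto3.client("cloudformation", region_name="eu-west-1")
# print(cloudformation_activity(cfn, "codedeploylambda-development-vdm-xdm-xxx-xxx"))
```

If this reports no stack, no change sets, and no events while sam deploy is hanging, the request most likely never reached CloudFormation.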