1click-hpc
`pcluster create-cluster --wait` is exiting early
You can see this in bootstrap.log:
+ /home/ec2-user/.local/bin/pcluster create-cluster --cluster-name hpc-1click-hpc365 --cluster-configuration config.us-east-1.yaml --rollback-on-failure false --wait
{
"message": "The security token included in the request is expired"
}
It seems that when the command runs longer than 10-15 minutes, the EC2/Cloud9 credentials rotate and the pcluster CLI does not pick up the refreshed ones. The `--wait` option also seems to be deprecated in pcluster, so we probably need to move to a polling approach that allows the credentials to refresh between checks.
This causes the outer CloudFormation stack to fail on the first attempt, but it succeeds when retried.
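A rough sketch of what the polling approach could look like, in Python. This is not taken from the repo: the function names `describe_status` and `wait_for_cluster`, the 30-second interval, and the one-hour timeout are all hypothetical. It assumes the pcluster 3 CLI, where `pcluster describe-cluster --cluster-name NAME` prints JSON containing a `clusterStatus` field. The idea is that `create-cluster` is launched without `--wait`, and each status check then runs in a fresh `pcluster` process, which should resolve credentials anew instead of holding one set for the whole wait.

```python
import json
import subprocess
import time


def describe_status(cluster_name):
    """Return clusterStatus from `pcluster describe-cluster`.

    Assumption: pcluster 3 is on PATH and prints JSON with a
    `clusterStatus` key. Each call starts a new process, so credentials
    are looked up again on every poll.
    """
    out = subprocess.run(
        ["pcluster", "describe-cluster", "--cluster-name", cluster_name],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out)["clusterStatus"]


def wait_for_cluster(cluster_name, get_status=describe_status,
                     interval=30, timeout=3600, sleep=time.sleep):
    """Poll until the cluster leaves CREATE_IN_PROGRESS.

    Returns the first status that is not CREATE_IN_PROGRESS (for example
    CREATE_COMPLETE or CREATE_FAILED). Raises TimeoutError if the cluster
    is still being created after roughly `timeout` seconds of sleeping.
    `get_status` and `sleep` are injectable so the loop can be exercised
    without AWS access.
    """
    waited = 0
    while True:
        status = get_status(cluster_name)
        if status != "CREATE_IN_PROGRESS":
            return status
        if waited >= timeout:
            raise TimeoutError(
                f"{cluster_name} still CREATE_IN_PROGRESS after {waited}s"
            )
        sleep(interval)
        waited += interval
```

In the bootstrap script this would be called right after `pcluster create-cluster ...` (without `--wait`), for example `wait_for_cluster("hpc-1click-hpc365")`, and the script would exit non-zero unless the returned status is `CREATE_COMPLETE`.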