aws-eda-slurm-cluster

Running install.sh with --cdk-cmd update in rapid succession can damage the cluster

Open gwolski opened this issue 1 year ago • 0 comments

I ran install.sh with --cdk-cmd update to change my instance selections. Then I realized I wanted an additional change, so I modified my config file and ran the update again. Unfortunately, this corrupted my cluster because the two commands ran too close together. The second command tried to do a rollback, and that failed.

Can we add a check to ensure the CloudFormation stack is not in an IN_PROGRESS state (for example, UPDATE_IN_PROGRESS) before allowing install.sh to run an update?
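
To illustrate the kind of guard I have in mind, here is a minimal sketch in Python using boto3. It is not taken from install.sh; the function names (`is_operation_in_progress`, `get_stack_status`, `ensure_stack_is_idle`) are hypothetical, and I am assuming the cluster's CloudFormation stack name is known at the point where the update starts. It relies on the fact that every CloudFormation status for a running operation ends in `_IN_PROGRESS`.

```python
from typing import Callable, Optional


def is_operation_in_progress(stack_status: str) -> bool:
    """Return True if a CloudFormation stack status means an operation is still running.

    Covers UPDATE_IN_PROGRESS, UPDATE_ROLLBACK_IN_PROGRESS,
    UPDATE_COMPLETE_CLEANUP_IN_PROGRESS, and the other *_IN_PROGRESS states.
    """
    return stack_status.endswith("_IN_PROGRESS")


def get_stack_status(stack_name: str, region: Optional[str] = None) -> str:
    """Look up the current status of a stack (hypothetical helper, needs AWS credentials)."""
    import boto3  # imported lazily so the pure check above works without boto3

    cfn = boto3.client("cloudformation", region_name=region)
    response = cfn.describe_stacks(StackName=stack_name)
    return response["Stacks"][0]["StackStatus"]


def ensure_stack_is_idle(
    stack_name: str,
    region: Optional[str] = None,
    get_status: Callable[[str, Optional[str]], str] = get_stack_status,
) -> str:
    """Raise RuntimeError if the stack is mid-operation; otherwise return its status."""
    status = get_status(stack_name, region)
    if is_operation_in_progress(status):
        raise RuntimeError(
            f"Stack {stack_name} is in state {status}; "
            "wait for it to finish before running another update."
        )
    return status
```

Calling something like `ensure_stack_is_idle("my-cluster-stack")` before starting the update would make the second run fail fast with a clear message instead of colliding with the first one. The stack name here is a placeholder.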

To reproduce, change some instances in your config and run the update, then change the config again and rerun the update right away, before the first one finishes.

gwolski, Apr 12 '24 00:04