Cloudformation hanging during deployment stage

Open PhillypHenning opened this issue 3 years ago • 6 comments

I've noticed an ongoing issue with the CloudFormation deployment.

until echo "$STATUS" | grep -Eq 'CREATE_COMPLETE|UPDATE_COMPLETE|COMPLETE|FAILED|DELETE_IN_PROGRESS'; do
  # DEPLOYMENT STAGE 1: list events whose status contains CREATE_IN_PROGRESS
  aws cloudformation describe-stack-events --stack-name "${CFN_STACK_NAME}" --query 'StackEvents[?contains(ResourceStatus,`CREATE_IN_PROGRESS`)].[LogicalResourceId, ResourceStatus, ResourceType, ResourceStatusReason]'

  # DEPLOYMENT STAGE 2: list events whose status contains FAILED
  aws cloudformation describe-stack-events --stack-name "${CFN_STACK_NAME}" --query 'StackEvents[?contains(ResourceStatus,`FAILED`)].[LogicalResourceId, ResourceStatus, ResourceType, ResourceStatusReason]'

  sleep 10
  # DEPLOYMENT STAGE 3: refresh the overall stack status
  STATUS=$(aws cloudformation describe-stacks --stack-name "${CFN_STACK_NAME}" --query "Stacks[0].StackStatus" --output text)
done

It seems to me that during STAGE 1, the query can hang and become non-responsive. I've noticed this occurs if the stack is in any state that isn't CREATE_IN_PROGRESS.
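One way to stop the loop from waiting indefinitely is to cap the number of polls. This is only a sketch: it assumes the hang comes from the loop never seeing a terminal status, not from an individual CLI call blocking. The helper names is_terminal_status and wait_for_stack, and the default attempt count and interval, are hypothetical. The status patterns mirror the ones in the loop above.

```shell
# Hypothetical helper: succeeds when the status should end the polling loop.
# The patterns mirror the grep expression used in the original loop.
is_terminal_status() {
  case "$1" in
    *COMPLETE*|*FAILED*|DELETE_IN_PROGRESS) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical bounded wait: polls the stack status at most max_attempts times.
# $1 = stack name, $2 = max attempts (assumed default 60), $3 = seconds between polls (assumed default 10)
wait_for_stack() {
  local stack_name="$1"
  local max_attempts="${2:-60}"
  local interval="${3:-10}"
  local attempt=1
  local status=""

  while [ "$attempt" -le "$max_attempts" ]; do
    status=$(aws cloudformation describe-stacks --stack-name "$stack_name" \
      --query "Stacks[0].StackStatus" --output text)
    echo "STATUS: [${status}]"
    if is_terminal_status "$status"; then
      return 0
    fi
    sleep "$interval"
    attempt=$((attempt + 1))
  done

  echo "Timed out waiting for stack ${stack_name}" >&2
  return 1
}
```

Calling wait_for_stack "${CFN_STACK_NAME}" 30 10 would give up after roughly five minutes and return a non-zero exit code that the deployment script can act on.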

PhillypHenning avatar Jan 10 '22 14:01 PhillypHenning

A few things that should be noted.

  1. AWS_DEFAULT_REGION needs to match the region that the stack exists in.
  2. The logs are being queried at the "most recent" state. This means we are checking whether events with the given status (in this case CREATE_IN_PROGRESS) exist anywhere in the stack events, not whether they are the most recent event.
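
Both notes can be folded into the event query itself. A minimal sketch, assuming the events come back newest first (the order describe-stack-events documents), so StackEvents[0] is the most recent event. The helper name latest_stack_event is hypothetical, and the region is passed explicitly with --region instead of relying on AWS_DEFAULT_REGION.

```shell
# Hypothetical helper: print only the newest event for a stack.
# $1 = stack name, $2 = region the stack lives in
latest_stack_event() {
  local stack_name="$1"
  local region="$2"

  # Assumption: StackEvents is ordered newest first, so index 0 is the latest event.
  aws cloudformation describe-stack-events \
    --region "$region" \
    --stack-name "$stack_name" \
    --query 'StackEvents[0].[LogicalResourceId, ResourceStatus, ResourceType, ResourceStatusReason]' \
    --output text
}
```

For example, latest_stack_event "${CFN_STACK_NAME}" us-east-1 would print a single tab-separated line for the latest event, not every event that ever had a matching status.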

PhillypHenning avatar Jan 10 '22 14:01 PhillypHenning

This will be fixed with the multi-regional deployment change

PhillypHenning avatar Jan 13 '22 17:01 PhillypHenning

Snippet of logging output:

STATUS: [CREATE_IN_PROGRESS]
{
    "StackEvents": [
        {
            "StackId": "arn:aws:cloudformation:us-east-1:186513196687:stack/test-mr-clearwater-ecs-infra/a08ce2a0-74ab-11ec-a9db-12e3f0953fd1",
            "EventId": "SecurityGroups-CREATE_IN_PROGRESS-2022-01-13T20:01:50.487Z",
            "StackName": "test-mr-clearwater-ecs-infra",
            "LogicalResourceId": "SecurityGroups",
            "PhysicalResourceId": "arn:aws:cloudformation:us-east-1:186513196687:stack/test-mr-clearwater-ecs-infra-SecurityGroups-XRU6YSRBSI9/a5255680-74ab-11ec-96fc-0eb6db23da45",
            "ResourceType": "AWS::CloudFormation::Stack",
            "Timestamp": "2022-01-13T20:01:50.487000+00:00",
            "ResourceStatus": "CREATE_IN_PROGRESS",
            "ResourceStatusReason": "Resource creation Initiated",
            "ResourceProperties": "{\"TemplateURL\":\"https://s3.amazonaws.com/clearwater-bitops-deployments/multiregion-deployment/templates/security-groups.yaml\",\"Parameters\":{\"VpcId\":\"vpc-0bc30f5a3a12d11e0\",\"SourceSecurityGroup\":\"sg-0edf34e8738a4688c\",\"Region\":\"us-east-1\"}}"
        },
        {
            "StackId": "arn:aws:cloudformation:us-east-1:186513196687:stack/test-mr-clearwater-ecs-infra/a08ce2a0-74ab-11ec-a9db-12e3f0953fd1",
            "EventId": "SecurityGroups-CREATE_IN_PROGRESS-2022-01-13T20:01:49.433Z",
            "StackName": "test-mr-clearwater-ecs-infra",
            "LogicalResourceId": "SecurityGroups",
            "PhysicalResourceId": "",
            "ResourceType": "AWS::CloudFormation::Stack",
            "Timestamp": "2022-01-13T20:01:49.433000+00:00",
            "ResourceStatus": "CREATE_IN_PROGRESS",
            "ResourceProperties": "{\"TemplateURL\":\"https://s3.amazonaws.com/clearwater-bitops-deployments/multiregion-deployment/templates/security-groups.yaml\",\"Parameters\":{\"VpcId\":\"vpc-0bc30f5a3a12d11e0\",\"SourceSecurityGroup\":\"sg-0edf34e8738a4688c\",\"Region\":\"us-east-1\"}}"
        },
        {
            "StackId": "arn:aws:cloudformation:us-east-1:186513196687:stack/test-mr-clearwater-ecs-infra/a08ce2a0-74ab-11ec-a9db-12e3f0953fd1",
            "EventId": "a08f7ab0-74ab-11ec-a9db-12e3f0953fd1",
            "StackName": "test-mr-clearwater-ecs-infra",
            "LogicalResourceId": "test-mr-clearwater-ecs-infra",
            "PhysicalResourceId": "arn:aws:cloudformation:us-east-1:186513196687:stack/test-mr-clearwater-ecs-infra/a08ce2a0-74ab-11ec-a9db-12e3f0953fd1",
            "ResourceType": "AWS::CloudFormation::Stack",
            "Timestamp": "2022-01-13T20:01:42.494000+00:00",
            "ResourceStatus": "CREATE_IN_PROGRESS",
            "ResourceStatusReason": "User Initiated"
        }
    ]
}
STATUS: [CREATE_IN_PROGRESS]
{
    "StackEvents": [
        {
            "StackId": "arn:aws:cloudformation:us-east-1:186513196687:stack/test-mr-clearwater-ecs-infra/a08ce2a0-74ab-11ec-a9db-12e3f0953fd1",
            "EventId": "SecurityGroups-CREATE_IN_PROGRESS-2022-01-13T20:01:50.487Z",
            "StackName": "test-mr-clearwater-ecs-infra",
            "LogicalResourceId": "SecurityGroups",
            "PhysicalResourceId": "arn:aws:cloudformation:us-east-1:186513196687:stack/test-mr-clearwater-ecs-infra-SecurityGroups-XRU6YSRBSI9/a5255680-74ab-11ec-96fc-0eb6db23da45",
            "ResourceType": "AWS::CloudFormation::Stack",
            "Timestamp": "2022-01-13T20:01:50.487000+00:00",
            "ResourceStatus": "CREATE_IN_PROGRESS",
            "ResourceStatusReason": "Resource creation Initiated",
            "ResourceProperties": "{\"TemplateURL\":\"https://s3.amazonaws.com/clearwater-bitops-deployments/multiregion-deployment/templates/security-groups.yaml\",\"Parameters\":{\"VpcId\":\"vpc-0bc30f5a3a12d11e0\",\"SourceSecurityGroup\":\"sg-0edf34e8738a4688c\",\"Region\":\"us-east-1\"}}"
        },
        {
            "StackId": "arn:aws:cloudformation:us-east-1:186513196687:stack/test-mr-clearwater-ecs-infra/a08ce2a0-74ab-11ec-a9db-12e3f0953fd1",
            "EventId": "SecurityGroups-CREATE_IN_PROGRESS-2022-01-13T20:01:49.433Z",
            "StackName": "test-mr-clearwater-ecs-infra",
            "LogicalResourceId": "SecurityGroups",
            "PhysicalResourceId": "",
            "ResourceType": "AWS::CloudFormation::Stack",
            "Timestamp": "2022-01-13T20:01:49.433000+00:00",
            "ResourceStatus": "CREATE_IN_PROGRESS",
            "ResourceProperties": "{\"TemplateURL\":\"https://s3.amazonaws.com/clearwater-bitops-deployments/multiregion-deployment/templates/security-groups.yaml\",\"Parameters\":{\"VpcId\":\"vpc-0bc30f5a3a12d11e0\",\"SourceSecurityGroup\":\"sg-0edf34e8738a4688c\",\"Region\":\"us-east-1\"}}"
        },
        {
            "StackId": "arn:aws:cloudformation:us-east-1:186513196687:stack/test-mr-clearwater-ecs-infra/a08ce2a0-74ab-11ec-a9db-12e3f0953fd1",
            "EventId": "a08f7ab0-74ab-11ec-a9db-12e3f0953fd1",
            "StackName": "test-mr-clearwater-ecs-infra",
            "LogicalResourceId": "test-mr-clearwater-ecs-infra",
            "PhysicalResourceId": "arn:aws:cloudformation:us-east-1:186513196687:stack/test-mr-clearwater-ecs-infra/a08ce2a0-74ab-11ec-a9db-12e3f0953fd1",
            "ResourceType": "AWS::CloudFormation::Stack",
            "Timestamp": "2022-01-13T20:01:42.494000+00:00",
            "ResourceStatus": "CREATE_IN_PROGRESS",
            "ResourceStatusReason": "User Initiated"
        }
    ]
}

PhillypHenning avatar Jan 13 '22 20:01 PhillypHenning

@mickmcgrath13 / @ConnorGraham What are your thoughts on having a verbose logging flag for CloudFormation?

Verbose output would look like the above, and non-verbose output would be stripped down to:

STATUS: [CREATE_IN_PROGRESS]
STATUS: [CREATE_IN_PROGRESS]
STATUS: [CREATE_COMPLETE]

PhillypHenning avatar Jan 13 '22 20:01 PhillypHenning

@PhillypHenning there are already checks through bitops for the DEBUG env var. Would this suffice or do you want something more specific to CF?

ConnorGraham avatar Jan 13 '22 20:01 ConnorGraham

I was thinking this would be specifically for CF logs, though if you think it should be encompassed in the DEBUG flag I don't have any objections.
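
A rough sketch of how the two options could coexist, purely for discussion. CFN_VERBOSE is a hypothetical variable, not an existing bitops option, and the sketch assumes that the literal value "true" switches a flag on and that DEBUG is used as the fallback when CFN_VERBOSE is unset. The function names are hypothetical as well.

```shell
# Hypothetical flag check: CFN_VERBOSE is NOT an existing bitops option.
# Assumption: the value "true" enables output, and DEBUG is the fallback.
cfn_verbose_enabled() {
  [ "${CFN_VERBOSE:-${DEBUG:-false}}" = "true" ]
}

# Hypothetical logger: always prints the status line, and prints the
# event JSON only when verbose output is enabled.
# $1 = stack status, $2 = event JSON already fetched by the caller
log_stack_progress() {
  echo "STATUS: [$1]"
  if cfn_verbose_enabled; then
    printf '%s\n' "$2"
  fi
}
```

With neither variable set, only the STATUS lines are printed. Setting DEBUG=true or CFN_VERBOSE=true adds the event JSON, and CFN_VERBOSE=false keeps the CloudFormation output quiet even when DEBUG is on.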

PhillypHenning avatar Jan 14 '22 14:01 PhillypHenning