
Disable CloudFront distribution before deleting it

Open wdalmut opened this issue 3 years ago • 11 comments

This PR closes #571

It updates the CloudFront distribution status and waits until the distribution reaches the "Deployed" status.
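
As a rough illustration of a disable-then-wait flow like the one described here, a minimal Go sketch might look like the following. The `distributionAPI` interface, `disableAndWait`, `errNotDeployed`, and `fakeAPI` are hypothetical names made up for this example; they are not aws-nuke code or the AWS SDK interface, and a real change would call the SDK's CloudFront client instead. The status strings "InProgress" and "Deployed" are the ones CloudFront reports.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// distributionAPI is a hypothetical, minimal stand-in for the two CloudFront
// operations this flow needs; it is not the AWS SDK interface.
type distributionAPI interface {
	Disable(id string) error          // request Enabled=false
	Status(id string) (string, error) // "InProgress" or "Deployed"
}

var errNotDeployed = errors.New("distribution did not reach Deployed in time")

// disableAndWait disables the distribution, then polls its status until it
// reports "Deployed" or maxAttempts polls have been made.
func disableAndWait(api distributionAPI, id string, maxAttempts int, interval time.Duration) error {
	if err := api.Disable(id); err != nil {
		return fmt.Errorf("disable %s: %w", id, err)
	}
	for attempt := 0; attempt < maxAttempts; attempt++ {
		status, err := api.Status(id)
		if err != nil {
			return fmt.Errorf("status %s: %w", id, err)
		}
		if status == "Deployed" {
			return nil // the update has settled; a delete can be attempted
		}
		time.Sleep(interval)
	}
	return fmt.Errorf("%s: %w", id, errNotDeployed)
}

// fakeAPI simulates a distribution that needs a few polls to settle.
type fakeAPI struct {
	pollsUntilDeployed int
	disabled           bool
}

func (f *fakeAPI) Disable(id string) error {
	f.disabled = true
	return nil
}

func (f *fakeAPI) Status(id string) (string, error) {
	if f.pollsUntilDeployed > 0 {
		f.pollsUntilDeployed--
		return "InProgress", nil
	}
	return "Deployed", nil
}
```

Calling `disableAndWait(&fakeAPI{pollsUntilDeployed: 3}, "E2EXAMPLE", 10, time.Second)` returns nil once the fake reports "Deployed"; with too few attempts it returns an error wrapping `errNotDeployed`. Note that this variant blocks the caller for the whole wait.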

wdalmut avatar Nov 19 '20 19:11 wdalmut

OK,

I will try to return an error if the CloudFront distribution is in the "update in progress" state.

wdalmut avatar Nov 29 '20 07:11 wdalmut

@wdalmut Do you think you can implement the change?

der-eismann avatar Feb 05 '21 11:02 der-eismann

@wdalmut Do you think you can implement the change?

Yes, I want to. I use this project and like it very much, and I really need that fix (I have actually built my own release).

I am just struggling to work out how to implement it because I don't have much time these months...

wdalmut avatar Feb 06 '21 07:02 wdalmut

@svenwltr I just gave it a try. The problem is that disabling the distribution takes a while, and when it is the only resource left, it fails:

Do you really want to nuke these resources on the account with the ID xxxxxxxx and the alias 'test'?
Do you want to continue? Enter account alias to continue.
> test

global - CloudFrontDistributionDeployment - E2SCK7OH4FPF3J - triggered remove
global - CloudFrontDistribution - E2SCK7OH4FPF3J - failed

Removal requested: 1 waiting, 1 failed, 114 skipped, 0 finished

global - CloudFrontDistributionDeployment - E2SCK7OH4FPF3J - waiting
global - CloudFrontDistribution - E2SCK7OH4FPF3J - failed

Removal requested: 1 waiting, 1 failed, 114 skipped, 0 finished

(repeats 23 times)

global - CloudFrontDistributionDeployment - E2SCK7OH4FPF3J - removed
global - CloudFrontDistribution - E2SCK7OH4FPF3J - failed

Removal requested: 0 waiting, 1 failed, 114 skipped, 1 finished

global - CloudFrontDistribution - E2SCK7OH4FPF3J - failed

Removal requested: 0 waiting, 1 failed, 114 skipped, 1 finished

global - CloudFrontDistribution - E2SCK7OH4FPF3J - failed

Removal requested: 0 waiting, 1 failed, 114 skipped, 1 finished

ERRO[0185] There are resources in failed state, but none are ready for deletion, anymore. 

global - CloudFrontDistribution - E2SCK7OH4FPF3J - failed
ERRO[0185] CloudFront Distribution E2SCK7OH4FPF3J update is still In Progress 
Error: failed

I'm not really sure how to fix this, apart from using the blocking call. A waiting state would be more consistent than failed, but I don't see a way to do that without changing the queue code.

der-eismann avatar Feb 08 '21 11:02 der-eismann

Some ideas:

  • Split resources, so we have one resource for the state and one for the actual distribution.
  • Somehow introduce an error type that indicates that this resource needs to be retried.

Having a blocking call means that aws-nuke does nothing while waiting, which would slow down the whole nuke operation. This might not be bad for a single resource, but if we start with it here, then others will follow. We should be consistent with the workflow.
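
The retry-error idea could be sketched roughly as below. This is only an illustration under assumptions: `retryableError`, `removeDistribution`, and `nextState` are hypothetical names, not existing aws-nuke types, and the state strings are simply borrowed from the output earlier in this thread. The point is that the removal attempt returns immediately and the queue decides between "waiting" and "failed" based on the error type.

```go
package main

import (
	"errors"
	"fmt"
)

// retryableError is a hypothetical marker type: it wraps an error to say
// "not ready yet, ask again on the next pass" instead of "failed".
type retryableError struct {
	err error
}

func (e *retryableError) Error() string { return "retry later: " + e.err.Error() }
func (e *retryableError) Unwrap() error { return e.err }

// removeDistribution imitates a non-blocking removal attempt: it returns
// immediately and reports an in-progress update as retryable.
func removeDistribution(id, status string, enabled bool) error {
	if status == "InProgress" {
		return &retryableError{fmt.Errorf("distribution %s update is still in progress", id)}
	}
	if enabled {
		return fmt.Errorf("distribution %s has not been disabled", id)
	}
	return nil
}

// nextState maps the result of a removal attempt onto the state names seen
// in the output above.
func nextState(err error) string {
	var retry *retryableError
	switch {
	case err == nil:
		return "triggered remove"
	case errors.As(err, &retry):
		return "waiting" // try again on the next pass, without blocking
	default:
		return "failed"
	}
}
```

With this shape, `nextState(removeDistribution("E2EXAMPLE", "InProgress", false))` yields "waiting" rather than "failed", so other resources keep being processed in the meantime.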

svenwltr avatar Feb 08 '21 11:02 svenwltr

  • Split resources, so we have one resource for the state and one for the actual distribution.

This is actually what we do already. We have a separate CloudFrontDistributionDeployment resource that disables the distribution. Actually I just nuked twice with the current code and it succeeded:

global - CloudFrontDistribution - E2OJCF6O53KIP0 - failed
global - CloudFrontDistributionDeployment - E2OJCF6O53KIP0 - waiting

Removal requested: 1 waiting, 1 failed, 114 skipped, 0 finished

(repeats 26x)

global - CloudFrontDistribution - E2OJCF6O53KIP0 - removed
global - CloudFrontDistributionDeployment - E2OJCF6O53KIP0 - removed

Removal requested: 0 waiting, 0 failed, 114 skipped, 2 finished

My guess is that the users filter only for the CloudFrontDistribution resource and that's why it fails. To be fair, it is a bit confusing.

@wdalmut Can you post your nuke config?

der-eismann avatar Feb 08 '21 12:02 der-eismann

@der-eismann here is my example configuration:

regions:
- eu-west-1
- eu-central-1
- us-east-1
- global

account-blacklist:
- "999999999999" # production

accounts:
  "123456789012":
    filters:
      IAMSAMLProvider:
      - arn:aws:iam::123456789012:saml-provider/AWSSSO_123456789012_DO_NOT_DELETE
      IAMRole:
      - AWSReservedSSO_AdministratorAccess_deadbeef
      - OurAdminRole
      - AWSCloudFormationStackSetExecutionRole
      Route53HostedZone:
      - /hostedzone/SOMETHINGROUTE53 (domain.name.tld.)
      IAMRolePolicy:
      - OurAdminRole -> AdministratorAccess
      IAMRolePolicyAttachment:
      - AWSReservedSSO_AdministratorAccess_deadbeef -> AdministratorAccess
      - AWSServiceRoleForSSO -> AWSSSOServiceRolePolicy
      - AWSCloudFormationStackSetExecutionRole -> AdministratorAccess
      CloudFormationStack:
      - dns
      - stack-set

We currently use this tool to drop everything in the account after our tests.

wdalmut avatar Feb 08 '21 18:02 wdalmut

Okay, only now do I see that you have a completely different problem: when trying to disable the distribution, you got an

IllegalUpdate: You cannot update the specified distribution using this API version because it is associated with a cache policy.

To be honest I can't reproduce this. I created a CF distribution with a custom cache policy, it's nuking fine. Does your distribution use any special features? Can you try the latest main code? Maybe it was resolved with an SDK update.

der-eismann avatar Feb 09 '21 08:02 der-eismann

@der-eismann Of course I can. I will update to your latest build, create a new CloudFront distribution on top of a bucket plus another custom origin, and test the account nuke. I will try to test this in the next few days and let you know.

wdalmut avatar Feb 09 '21 20:02 wdalmut

I tried with the latest version but I still get the error:

global - CloudFrontDistribution - CFDISTRIBUTIONID - failed
global - CloudFrontDistributionDeployment - CFDISTRIBUTIONID - failed

Removal requested: 0 waiting, 2 failed, 143 skipped, 9 finished

global - CloudFrontDistribution - CFDISTRIBUTIONID - failed
global - CloudFrontDistributionDeployment - CFDISTRIBUTIONID - failed

Removal requested: 0 waiting, 2 failed, 143 skipped, 9 finished

ERRO[0064] There are resources in failed state, but none are ready for deletion, anymore. 

global - CloudFrontDistribution - CFDISTRIBUTIONID - failed
ERRO[0064] DistributionNotDisabled: The distribution you are trying to delete has not been disabled.
        status code: 409, request id: 57acde7f-9af5-47ea-9ef7-504a6f7ba87b 
global - CloudFrontDistributionDeployment - CFDISTRIBUTIONID - failed
ERRO[0064] IllegalUpdate: You cannot update the specified distribution using this API version because it is associated with a cache policy.
        status code: 400, request id: 97ab830e-f054-409e-b18d-bb6381ce229f 
Error: failed

I just created an S3 bucket and a CloudFront distribution. The CF distribution also has a custom origin (for example www.example.org) and a custom behavior that uses path routing to the custom origin.

The CF distribution is never updated to disabled, and the tool errors out almost immediately.

wdalmut avatar Feb 10 '21 20:02 wdalmut

Another thing that would help here is to also remove any Lambda@Edge behaviors, and possibly any CloudFront Functions (which are different from Lambdas).
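
A minimal sketch of what detaching edge code before deletion could look like, assuming heavily simplified types: `cacheBehavior`, `distributionConfig`, and `detachEdgeCode` are hypothetical stand-ins written for this example, not the AWS SDK structures (which nest these associations under each cache behavior with quantity and item fields).

```go
package main

// The types below are simplified, hypothetical stand-ins for the parts of a
// distribution config that attach edge code to a cache behavior.
type cacheBehavior struct {
	PathPattern                string
	LambdaFunctionAssociations []string // Lambda@Edge function ARNs
	FunctionAssociations       []string // other edge function ARNs
}

type distributionConfig struct {
	Enabled   bool
	Behaviors []cacheBehavior
}

// detachEdgeCode returns a copy of cfg with every edge association removed
// and the distribution disabled; it also reports how many associations were
// dropped. The input config is left unchanged.
func detachEdgeCode(cfg distributionConfig) (distributionConfig, int) {
	out := distributionConfig{Enabled: false}
	dropped := 0
	for _, b := range cfg.Behaviors {
		dropped += len(b.LambdaFunctionAssociations) + len(b.FunctionAssociations)
		out.Behaviors = append(out.Behaviors, cacheBehavior{PathPattern: b.PathPattern})
	}
	return out, dropped
}
```

The returned config would then be sent in the same update that disables the distribution, so that the edge functions are no longer referenced by the time they are deleted.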

sysfsss avatar Jun 04 '21 01:06 sysfsss