
CloudWatchLogsLogGroup - stuck waiting

Open · mbergal-idx opened this issue 4 years ago · 18 comments

us-east-2 - CloudWatchLogsLogGroup - /aws/eks/idx-eks-pulumi-test-eksCluster-6586a80/cluster - waiting

Removal requested: 1 waiting, 0 failed, 74 skipped, 95 finished

Not sure what to add, but I am able to delete this log group manually.

mbergal-idx avatar Apr 30 '20 01:04 mbergal-idx

Maybe it's because, when the delete request is made, the EKS cluster is still writing to it?

mbergal-idx avatar Apr 30 '20 03:04 mbergal-idx

@mbergal-idx which release are you running?

I've just got this problem with the v2.14.0. With v2.14.0 the removal went fine.

ap-southeast-2 - CloudWatchLogsLogGroup - /aws/vpc/mgmt-VPC/flow-logs - waiting

Removal requested: 1 waiting, 0 failed, 68 skipped, 0 finished

ap-southeast-2 - CloudWatchLogsLogGroup - /aws/vpc/mgmt-VPC/flow-logs - removed

Removal requested: 0 waiting, 0 failed, 68 skipped, 1 finished

rbbrasil avatar May 05 '20 20:05 rbbrasil

@mbergal-idx Your above comment contains the same version twice. I am also using v2.14.0 and having the same problem with EKS cluster deletion and CloudWatch log group deletion.

rajivchirania avatar May 12 '20 14:05 rajivchirania

Oops! That was a typo. Sorry.

The Docker image version that worked for me was 2.12.0.

rbbrasil avatar May 12 '20 16:05 rbbrasil

I am using 2.14.0. Deleting a single log group works fine, but if the log group is associated with the cluster, it does not get deleted for some reason.

mbergal-idx avatar May 12 '20 18:05 mbergal-idx

Hello.

Sorry for the late response. Can you give us a hint to reproduce this error?

svenwltr avatar May 25 '20 13:05 svenwltr

This happens if it is the EKS cluster's log group. I think the log group does get deleted, but the cluster recreates it, since the cluster itself takes longer to delete. If this is not enough, I might be able to create a simple Pulumi script as a repro.
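
If that is the cause, a workaround outside aws-nuke would be to wait for the cluster to actually be gone before touching its log group. A rough boto3 sketch (region and cluster name are placeholders):

import boto3

REGION = "us-east-2"        # placeholder
CLUSTER = "my-eks-cluster"  # placeholder

eks = boto3.client("eks", region_name=REGION)
logs = boto3.client("logs", region_name=REGION)

# Block until the control plane no longer exists, so it cannot
# recreate its log group behind our back.
eks.get_waiter("cluster_deleted").wait(name=CLUSTER)

# Now the delete should stick.
logs.delete_log_group(logGroupName=f"/aws/eks/{CLUSTER}/cluster")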

mbergal-idx avatar May 27 '20 05:05 mbergal-idx

I have the same issue. The EKS cluster creates a CloudWatch log group; if I delete the log group by hand, it deletes just fine. I'm on v2.13.0 and will try to upgrade to see if it helps.

rickepnet avatar Jun 17 '20 00:06 rickepnet

👍🏻 on this issue. If I restart the tool against the account in question, the log group is deleted with no problems.

This is a bit of an issue when using automation, as the tool will simply recycle the deletion process and continue to fail. I'm not sure whether I can set a retry count and exit with an error, but that would be preferable (a rough wrapper along those lines is sketched below).

v2.14.0 is the version I am using, also in ap-southeast-2 region.
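
In the meantime, a rough wrapper is a possible workaround: give each aws-nuke attempt a timeout, retry a few times, and exit non-zero if it never converges. This is only a sketch; the config path and flags are examples, so check them against your aws-nuke version:

import subprocess
import sys

# Example invocation; adjust config path and flags for your setup.
CMD = ["aws-nuke", "-c", "nuke-config.yml", "--no-dry-run", "--force"]
MAX_ATTEMPTS = 3
TIMEOUT = 30 * 60  # seconds per attempt

for attempt in range(1, MAX_ATTEMPTS + 1):
    try:
        if subprocess.run(CMD, timeout=TIMEOUT).returncode == 0:
            sys.exit(0)  # clean run, nothing left waiting
    except subprocess.TimeoutExpired:
        print(f"attempt {attempt} timed out; retrying", file=sys.stderr)

sys.exit("aws-nuke did not converge after retries")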

jbarnes avatar Feb 07 '21 23:02 jbarnes

We're also seeing this issue, and it's also caused by EKS log groups.

Reproduction steps should be:

  1. Create any cluster with the eksctl tool and make sure to enable control plane logging to CloudWatch
  2. Run aws-nuke on the account

Probably the important part is that the cluster still exists when aws-nuke is run and is also deleted in the process. Log groups from previously deleted clusters do not cause this issue.
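
For what it's worth, you can confirm that control plane logging is enabled (the case that seems to trigger this) with a quick boto3 check; the cluster name below is a placeholder:

import boto3

eks = boto3.client("eks", region_name="eu-west-1")
cluster = eks.describe_cluster(name="my-test-cluster")["cluster"]  # placeholder name

# EKS reports control plane logging as a list of {types, enabled} entries.
for entry in cluster.get("logging", {}).get("clusterLogging", []):
    state = "enabled" if entry.get("enabled") else "disabled"
    print(entry.get("types"), state)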

artem-nefedov avatar Mar 11 '21 08:03 artem-nefedov

I think the log group does get deleted, but the cluster recreates it, since the cluster itself takes longer to delete.

I follow this idea. Unfortunately, I do not see an obvious solution to this.

svenwltr avatar Mar 26 '21 12:03 svenwltr

I have the same situation with version 2.15. The CW log group is stuck. Logs:

13:35:15  us-west-2 - CloudWatchLogsLogGroup - /aws/eks/{ cluster_name }/cluster - waiting
13:35:15  Removal requested: 1 waiting, 0 failed, 147 skipped, 64 finished
13:35:20  us-west-2 - CloudWatchLogsLogGroup - /aws/eks/{ cluster_name }/cluster - waiting
13:35:20  Removal requested: 1 waiting, 0 failed, 147 skipped, 64 finished
13:35:25  us-west-2 - CloudWatchLogsLogGroup - /aws/eks/{ cluster_name }/cluster - waiting
13:35:25  Removal requested: 1 waiting, 0 failed, 147 skipped, 64 finished

ivan-sukhomlyn avatar Jul 07 '21 10:07 ivan-sukhomlyn

Still seeing aws-nuke (v2.17) hang indefinitely when deleting CloudWatch log groups. If I cancel the aws-nuke run and re-run it, the log group deletes immediately without issue.

This is unrelated to EKS for me.

To reproduce, use boto3 to create a log group:

import boto3

# accessKeyId / secretAccessKey are the target account's credentials
mySession = boto3.Session(
    aws_access_key_id=accessKeyId, aws_secret_access_key=secretAccessKey
)
logsClient = mySession.client('logs', region_name='eu-west-1')
log_group_response = logsClient.create_log_group(
    logGroupName='all-rejected-traffic'
)

This is in eu-west-1, in a completely wiped account. The log group is then targeted by a flow log on the default VPC. Running aws-nuke then gives:

Removal requested: 1 waiting, 0 failed, 296 skipped, 21 finished

eu-west-1 - CloudWatchLogsLogGroup - all-rejected-traffic - waiting

Removal requested: 1 waiting, 0 failed, 296 skipped, 21 finished

eu-west-1 - CloudWatchLogsLogGroup - all-rejected-traffic - waiting

Removal requested: 1 waiting, 0 failed, 296 skipped, 21 finished

eu-west-1 - CloudWatchLogsLogGroup - all-rejected-traffic - waiting

It seems aws-nuke is getting stuck in some indefinite cycle?
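
For completeness, the flow log half of the repro (continuing from the session in the snippet above) looks roughly like this; the IAM role ARN is a placeholder and needs the usual CloudWatch Logs permissions:

ec2 = mySession.client('ec2', region_name='eu-west-1')

# Find the default VPC in the region.
default_vpc_id = ec2.describe_vpcs(
    Filters=[{'Name': 'is-default', 'Values': ['true']}]
)['Vpcs'][0]['VpcId']

# Point a REJECT-traffic flow log at the log group created above.
ec2.create_flow_logs(
    ResourceIds=[default_vpc_id],
    ResourceType='VPC',
    TrafficType='REJECT',
    LogDestinationType='cloud-watch-logs',
    LogGroupName='all-rejected-traffic',
    DeliverLogsPermissionArn='arn:aws:iam::123456789012:role/flow-logs-role',  # placeholder role
)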

wushingmushine avatar May 31 '22 14:05 wushingmushine

Infinite waiting with aws-nuke version 2.22.1:

eu-west-1 - CloudWatchEventsRule - Rule: AutoScalingManagedRule - failed
eu-west-1 - CloudWatchLogsLogGroup - /aws/eks/otomi/cluster - [CreatedTime: "168334--183", LastEvent: "2023-05-06T06:20:34+02:00", logGroupName: "/aws/eks/otomi/cluster", tag:created_by: "terragrunt", tag:k8s: "custom", tag:workspace: "testing"] - waiting
eu-west-1 - CloudWatchEventsTarget - Rule: AutoScalingManagedRule Target ID: autoscaling - waiting

Removal requested: 2 waiting, 2 failed, 93 skipped, 181 finished

After a re-run, it terminates gracefully.

leiarenee avatar May 06 '23 04:05 leiarenee

Still failing with v2.22.1.15.ge45750a

eu-central-1 - CloudWatchEventsRule - Rule: AutoScalingManagedRule - failed
eu-central-1 - CloudWatchEventsTarget - Rule: AutoScalingManagedRule Target ID: autoscaling - waiting

Removal requested: 1 waiting, 1 failed, 557 skipped, 1387 finished

chronicc avatar Jun 05 '23 09:06 chronicc

Still the same issue, using quay.io/rebuy/aws-nuke:v2.23.0. I have two EKS clusters, each with one CloudWatchLogsLogGroup. One of them always gets deleted; the other stays in an infinite loop.

martivo avatar Jul 06 '23 12:07 martivo

I am facing the same issue as well. My GitHub workflow took 6 hours trying to delete the resources and was automatically cancelled due to the default timeout; it was stuck on the CloudWatchLogsLogGroup deletion.

GitHub Actions gives 3000 free minutes for a free organization account, I'm getting this error every time, and I have to work around the GitHub time limits.

eu-west-1 - CloudWatchEventsTarget - Rule: AutoScalingManagedRule Target ID: autoscaling - waiting
eu-west-1 - CloudWatchEventsRule - Rule: AutoScalingManagedRule - waiting
eu-west-1 - CloudWatchLogsLogGroup - /aws/eks/eks-velero-cluster/cluster - [CreatedTime: "1695364752907", LastEvent: "2023-09-22T22:38:20Z", logGroupName: "/aws/eks/eks-velero-cluster/cluster", tag:Environment: "velero", tag:Managedby: "[email protected]", tag:Name: "eks-velero-cluster", tag:Repository: "https://github.com/clouddrove/terraform-aws-eks"] - waiting

Removal requested: 3 waiting, 0 failed, 930 skipped, 105 finished

eu-west-1 - CloudWatchEventsTarget - Rule: AutoScalingManagedRule Target ID: autoscaling - waiting
eu-west-1 - CloudWatchEventsRule - Rule: AutoScalingManagedRule - waiting
eu-west-1 - CloudWatchLogsLogGroup - /aws/eks/eks-velero-cluster/cluster - [CreatedTime: "1695364752907", LastEvent: "2023-09-22T22:38:20Z", logGroupName: "/aws/eks/eks-velero-cluster/cluster", tag:Environment: "velero", tag:Managedby: "[email protected]", tag:Name: "eks-velero-cluster", tag:Repository: "https://github.com/clouddrove/terraform-aws-eks"] - waiting

Removal requested: 3 waiting, 0 failed, 930 skipped, 105 finished

...and the same block keeps repeating until the workflow hits its timeout.


nileshgadgi avatar Sep 23 '23 19:09 nileshgadgi

Tagging along with the same issue here. Tearing down EKS clusters and their respective dependencies fails because aws-nuke can't delete the CloudWatch log group.

lucazz avatar Mar 01 '24 14:03 lucazz