aws-load-balancer-controller
Listener and listener-rule AWS tagging records leak when deleting an ALB
We have a simple non-shared-ALB-per-Ingress setup, and recently noticed while cleaning up an old cluster (running ALBC v2.3.0) that a number of minor AWS resources had leaked. For example, if you run the command below (with a suitable replacement for myclustername), you can see all the load-balancer resources this controller created for a cluster:
aws resourcegroupstaggingapi get-resources --tag-filters "Key=elbv2.k8s.aws/cluster,Values=myclustername" --query ResourceTagMappingList[].ResourceARN
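For illustration only (the region, account ID, names and resource IDs below are placeholders, not values from this cluster), the returned ARNs have roughly these shapes:

arn:aws:elasticloadbalancing:eu-west-1:111122223333:loadbalancer/app/k8s-example/0123456789abcdef
arn:aws:elasticloadbalancing:eu-west-1:111122223333:listener/app/k8s-example/0123456789abcdef/fedcba9876543210
arn:aws:elasticloadbalancing:eu-west-1:111122223333:listener-rule/app/k8s-example/0123456789abcdef/fedcba9876543210/0123456789abcdef
arn:aws:elasticloadbalancing:eu-west-1:111122223333:targetgroup/k8s-example/0123456789abcdef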
If you then delete an Ingress that originally created an ALB (i.e. not just a TargetGroup via the CRD), the "loadbalancer" and "targetgroup" resources are deleted, but the "listener" and "listener-rule" ones are not.
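As a rough sketch (not part of the original report), you can narrow the tagging output to just the leaked listener and listener-rule records by filtering on the ARN; the substring ':listener' matches both resource types:

aws resourcegroupstaggingapi get-resources \
  --tag-filters "Key=elbv2.k8s.aws/cluster,Values=myclustername" \
  --query "ResourceTagMappingList[?contains(ResourceARN, ':listener')].ResourceARN"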
The controller pod logs show creation entries for the listener/listener-rule resources but no matching deletion entries, unlike for the loadbalancer/targetgroup resources. No other errors are visible apart from a "Reconciler error" saying it cannot find the load balancer (which it deleted a few milliseconds earlier, so that is probably expected!)
Environment:
- AWS Load Balancer controller version: 2.3.0, but I also saw this with a test cluster on 2.4.1
- Kubernetes version: 1.20
- Using EKS: yes, latest 1.20 platform version
Checked our environment after reading this and can confirm the same behavior
Environment:
- AWS Load Balancer controller version: 2.4.1
- Kubernetes version: 1.20
- Using EKS: yes, latest 1.20 platform version eks.3
@fargue @tyrken We only delete the LoadBalancer when deleting the Ingresses. The Listeners/ListenerRules should automatically be deleted when the LoadBalancer is deleted.
I suspect this is an eventual-consistency issue with resourcegroupstaggingapi. Can you confirm whether these records eventually go away or stay stuck in your account?
If they are stuck, could you help run aws elbv2 describe-listeners for the ARNs returned?
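For clarity, the check would look something like this (substitute one of the leaked listener ARNs for the placeholder):

aws elbv2 describe-listeners --listener-arns <leaked-listener-arn>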
According to AWS support, the listener/listener-rule resources themselves are deleted when the load balancer is, but the resourcegroupstaggingapi records get left behind. They never seem to get cleaned up - at least we have some that are, at a guess, months old.
aws elbv2 describe-listeners fails for the leaked listener ARNs. Interestingly, you can run aws elbv2 delete-rule against one of the listener-rule ARNs repeatedly without error; doing so does clean up the resourcegroupstaggingapi record for that rule, but nothing similar works for the listener records.
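For reference, a sketch of the commands described above, with placeholders for the leaked ARNs:

# Fails, since the listener itself no longer exists
aws elbv2 describe-listeners --listener-arns <leaked-listener-arn>
# Succeeds repeatedly, and clears the tagging record for that rule
aws elbv2 delete-rule --rule-arn <leaked-listener-rule-arn>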
This appears to be an issue with the resource tagging APIs. We are following up with the ELB team for further investigation.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue or PR with /reopen
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closing this issue.
In response to this:
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.