aws-load-balancer-controller Self Healing: Allow AWS Controller to Detect and Fix AWS Resource Changes on Interval

Is your feature request related to a problem?

If a load balancer is deleted through the AWS Console, the AWS load balancer controller does not notice or re-create the load balancer.

The AWS load balancer controller must be restarted, and then the missing load balancer is recreated.

Describe the solution you'd like

An argument that could be passed to the controller telling it to do a full scan of AWS at a given interval, so that it can detect and fix drift in AWS from the expected state.

This would essentially emulate what the AWS load balancer controller does when it starts up.

For large deployments, a segment size argument might also be needed, for example:

Every 5 minutes, scan AWS for 100 ingresses; in the next 5 minutes, scan the next 100 ingresses, and so on.
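To make the segmenting idea concrete, here is a rough sketch in Go. It is not the controller's code and does not use any real controller API: `nextSegment`, `runPeriodicScan`, and the `list` and `reconcile` callbacks are hypothetical names made up for illustration. It only shows how a cursor could walk through the ingresses one segment per interval and wrap around at the end.

```go
package main

import (
	"fmt"
	"time"
)

// nextSegment returns up to size items starting at cursor, plus the cursor
// to use on the following pass. The cursor wraps to 0 after the last item.
// Hypothetical helper; not part of aws-load-balancer-controller.
func nextSegment(items []string, cursor, size int) ([]string, int) {
	if len(items) == 0 || size <= 0 {
		return nil, 0
	}
	if cursor < 0 || cursor >= len(items) {
		cursor = 0 // the list shrank since the last pass; start over
	}
	end := cursor + size
	if end > len(items) {
		end = len(items)
	}
	next := end
	if next == len(items) {
		next = 0
	}
	return items[cursor:end], next
}

// runPeriodicScan reconciles one segment per tick until stop is closed.
// list and reconcile are assumed callbacks standing in for "list ingresses"
// and "compare against AWS and fix drift".
func runPeriodicScan(list func() []string, reconcile func(string),
	interval time.Duration, size int, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	cursor := 0
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			var seg []string
			seg, cursor = nextSegment(list(), cursor, size)
			for _, key := range seg {
				reconcile(key)
			}
		}
	}
}

func main() {
	// Show which ingresses each pass would cover with a segment size of 2.
	ingresses := []string{"a", "b", "c", "d", "e"}
	cursor := 0
	for pass := 1; pass <= 4; pass++ {
		var seg []string
		seg, cursor = nextSegment(ingresses, cursor, 2)
		fmt.Println(pass, seg)
	}
}
```

With 5 ingresses and a segment size of 2, the passes cover `[a b]`, `[c d]`, `[e]`, and then start again at `[a b]`. In the example from the request, the interval would be 5 minutes and the segment size 100.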

Describe alternatives you've considered

I have tried all the existing arguments, such as the sync period, but none of them causes the load balancer to be re-created.

Sep 14 '22 17:09 lucas-howard-macmillan

/assign @M00nF1sh

Investigate further on the periodic sync issue.

This is similar to https://github.com/kubernetes-sigs/aws-load-balancer-controller/issues/2515

Sep 14 '22 22:09 kishorj

Experienced the same behaviour. I assumed that when I deleted the LB via the AWS Console, the ALB Controller would automatically recreate it; however, it did not.

Oct 04 '22 07:10 lukonjun

I'm experiencing the same issue. It only gets recovered when the number of replicas behind the service is changed.

I expected it to be recovered every 200s, according to the code here: https://github.com/kubernetes-sigs/aws-load-balancer-controller/blob/ec3418567841c1d36caf493c76105baf5e337b98/pkg/deploy/elbv2/target_group_binding_manager.go#L21-L26

Oct 25 '22 08:10 dongho-jung

Experienced the same by accident: I removed the wrong ALB from the AWS console, and the lb-controller only recreated the ALB once I restarted the lb-controller deployment.

Nov 18 '22 12:11 ChrisV78

Yes, that's unfortunately the only fix which is possible at the moment. I think in an older version (before the renaming) it was possible to get it recreated automatically. Can this please be fixed?

Nov 23 '22 07:11 mxkmp

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Feb 21 '23 08:02 k8s-triage-robot

/remove-lifecycle stale

Feb 21 '23 15:02 lucas-howard-macmillan

Yes, that's unfortunately the only fix which is possible at the moment. I think in an older version (before the renaming) it was possible to get it recreated automatically. Can this please be fixed?

In previous versions, it did automatically modify or recreate AWS resources when it detected that they were not correct.

While running a previous version, we had an issue where multiple load balancers were accidentally deleted, and by the time we were notified that there was an issue, the controller had already re-created them.

Mar 15 '23 13:03 lucas-howard-macmillan

/lifecycle stale

Jun 13 '23 13:06 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle rotten
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

Jul 13 '23 13:07 k8s-triage-robot

Hi, we have shipped the fix in v2.5.4. Please check the details in our release notes: https://github.com/kubernetes-sigs/aws-load-balancer-controller/releases/tag/v2.5.4. I'm closing this ticket for now; please feel free to reach out or reopen it if you have any issues. Thanks

Jul 17 '23 18:07 oliviassss

@oliviassss Thank You!

Sep 27 '23 20:09 lucas-howard-macmillan
