aws-load-balancer-controller

Self Healing: Allow AWS Controller to Detect and Fix AWS Resource Changes on Interval

Open lucas-howard-macmillan opened this issue 3 years ago • 2 comments

Is your feature request related to a problem?

If a load balancer is deleted through the AWS Console, the AWS load balancer controller does not notice the deletion or re-create the load balancer.

The AWS load balancer controller must be restarted, and then the missing load balancer is recreated.

Describe the solution you'd like

An argument that could be passed into the controller indicating that it should do a full scan of AWS on a certain interval in an attempt to detect and fix drift within AWS from the expected state.

This would basically emulate what the AWS load balancer controller does when it starts up.

Potentially, for large deployments, you might also need a segment size argument, e.g.:

Every 5 minutes, scan AWS for 100 ingresses, then in the next 5 minutes scan the next 100 ingresses, and so on.
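To make the segmented scan idea concrete, here is a minimal sketch in Python. It is not the controller's code (the controller is written in Go), and the names `drift_scan_batches` and `missing_load_balancers` are hypothetical, invented purely for illustration. It assumes the controller can list the ingresses it manages and can look up which load balancers currently exist in AWS.

```python
from typing import Iterator, List, Sequence, Set


def drift_scan_batches(ingresses: Sequence[str], segment_size: int) -> Iterator[List[str]]:
    """Yield one segment of ingresses per scan interval, wrapping around at the end.

    Hypothetical helper: each call to next() represents one interval tick
    (for example, every 5 minutes).
    """
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    start = 0
    while ingresses:
        yield list(ingresses[start:start + segment_size])
        start += segment_size
        if start >= len(ingresses):
            start = 0  # full pass finished, begin the next pass


def missing_load_balancers(batch: Sequence[str], existing: Set[str]) -> List[str]:
    """Return the ingresses in this batch whose load balancer no longer exists.

    `existing` stands in for whatever the real AWS lookup would return.
    """
    return [name for name in batch if name not in existing]


if __name__ == "__main__":
    batches = drift_scan_batches(["a", "b", "c", "d", "e"], segment_size=2)
    existing_in_aws = {"a", "c", "d", "e"}  # "b" was deleted in the console
    for _ in range(3):  # three interval ticks cover all five ingresses
        batch = next(batches)
        print(batch, "needs re-create:", missing_load_balancers(batch, existing_in_aws))
```

With a segment size of 100 and a 5 minute interval, a deployment with 1,000 ingresses would be fully re-checked roughly every 50 minutes, which is the trade-off between AWS API load and detection delay that the segment size argument would control.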

Describe alternatives you've considered

I have used all existing arguments, such as sync period, but none of them cause the load balancer to be re-created.

lucas-howard-macmillan • Sep 14 '22 17:09

/assign @M00nF1sh

Investigate further on the periodic sync issue. This is similar to https://github.com/kubernetes-sigs/aws-load-balancer-controller/issues/2515

kishorj • Sep 14 '22 22:09

Experienced the same behaviour. I assumed that when I deleted the LB via the AWS Console, the ALB Controller would automatically recreate it; however, it did not.

lukonjun • Oct 04 '22 07:10

I'm experiencing the same issue. It only gets recovered when the number of replicas behind the service is changed.

I expected it to be recovered every 200s, according to the code linked here: https://github.com/kubernetes-sigs/aws-load-balancer-controller/blob/ec3418567841c1d36caf493c76105baf5e337b98/pkg/deploy/elbv2/target_group_binding_manager.go#L21-L26

dongho-jung • Oct 25 '22 08:10

Experienced the same by accident: I removed the wrong ALB from the AWS console, and the lb-controller only recreates the ALB when you restart the lb-controller deployment.

ChrisV78 • Nov 18 '22 12:11

Yes, that's unfortunately the only fix which is possible at the moment. I think in an older version (before the renaming) it was possible to get it recreated automatically. Can this please be fixed?

mxkmp • Nov 23 '22 07:11

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot • Feb 21 '23 08:02

/remove-lifecycle stale

lucas-howard-macmillan • Feb 21 '23 15:02

> Yes, that's unfortunately the only fix which is possible at the moment. I think in an older version (before the renaming) it was possible to get it recreated automatically. Can this please be fixed?

In previous versions, it did automatically modify or recreate AWS resources when it detected that they were not correct.

While running a previous version, we had an issue where multiple load balancers were accidentally deleted, and by the time we were notified that there was an issue, the controller had already re-created them.

lucas-howard-macmillan • Mar 15 '23 13:03

/lifecycle stale

k8s-triage-robot • Jun 13 '23 13:06

/lifecycle rotten

k8s-triage-robot • Jul 13 '23 13:07

Hi, we have shipped the fix in v2.5.4. Please check the details in our release notes: https://github.com/kubernetes-sigs/aws-load-balancer-controller/releases/tag/v2.5.4. I'm closing this ticket for now; please feel free to reach out or reopen it if you have any issues. Thanks

oliviassss • Jul 17 '23 18:07

@oliviassss Thank You!

lucas-howard-macmillan • Sep 27 '23 20:09