
[EKS] [request]: Reset Managed Node Group status

Open nsilve opened this issue 10 months ago • 0 comments

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request We would like to have an alternative way to make an EKS Managed Node Group active/healthy again after a scale out failure.

Which service(s) is this request for? EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? Right now, when an EKS Managed Node Group fails to scale out (e.g. due to capacity unavailability), its status becomes degraded. The only way to reset the status back to active/healthy is a successful scale out. Even if we scale the node group to zero, its status remains degraded. So if AWS runs out of capacity for the specific instance type/AZ of the node group, it would stay degraded indefinitely. We would like an alternative way to reset the status back to active/healthy. As it stands, we have to add exceptions to our tools/monitoring, e.g. for a node group that is scaled to zero but stuck in degraded status.

Are you currently working around this issue? We have implemented some very custom exceptions in our tools/monitoring.
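
To illustrate the kind of monitoring exception described above, here is a minimal Python sketch. It assumes the node group description returned by the EKS `DescribeNodegroup` API (e.g. via boto3's `eks.describe_nodegroup(...)["nodegroup"]`), which exposes a `status` field and a `scalingConfig.desiredSize`. The function name `should_alert` and the suppression policy are hypothetical, not part of any AWS tooling:

```python
def should_alert(nodegroup: dict) -> bool:
    """Decide whether a DEGRADED managed node group should trigger an alert.

    Suppresses the edge case described above: a group scaled to zero that
    is stuck in DEGRADED from an earlier capacity failure, since it cannot
    become healthy again without a successful scale out.
    """
    if nodegroup.get("status") != "DEGRADED":
        return False
    desired = nodegroup.get("scalingConfig", {}).get("desiredSize", 0)
    # A scaled-to-zero group lingering in DEGRADED is a known exception;
    # only page when the group is actually expected to have capacity.
    return desired > 0


# Example: a group scaled to zero but still marked DEGRADED is ignored.
stuck = {"status": "DEGRADED", "scalingConfig": {"desiredSize": 0}}
print(should_alert(stuck))  # False
```

A native "reset status" API would make this kind of special-casing unnecessary.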

nsilve · Apr 08 '24 10:04