karpenter-provider-aws
fix: Add ClusterCIDRReady condition
Fixes #7875
Summary
This PR addresses an issue where EC2NodeClass resources using the AL2023 AMI family would become stuck in a NotReady state after temporary EKS API unavailability. The problem occurs when Cluster CIDR resolution fails during reconciliation: if the API call fails, the resource stays NotReady indefinitely, even after the API becomes available again.
Changes
- Added a new condition type ConditionTypeClusterCIDRReady to track the status of Cluster CIDR resolution separately
- Modified the Readiness controller to manage this specific condition instead of directly setting the general Ready condition
- Added the new condition to the list of required conditions in the validation controller
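The idea behind the changes above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `NodeClass`, `CIDRResolver`, `ReconcileClusterCIDR`, `requiredConditions`, and `ErrAPIUnavailable` are hypothetical stand-ins, and the assumption that non-AL2023 families simply mark the condition as true is mine. The point it tries to show is that when CIDR resolution owns its own condition and overall readiness is derived from the required conditions, a later successful reconcile can flip the resource back to Ready.

```go
package main

import (
	"errors"
	"fmt"
)

// ConditionStatus mirrors the three-valued status used by Kubernetes conditions.
type ConditionStatus string

const (
	StatusTrue  ConditionStatus = "True"
	StatusFalse ConditionStatus = "False"
)

// ConditionClusterCIDRReady names the dedicated condition described in this PR.
const ConditionClusterCIDRReady = "ClusterCIDRReady"

// requiredConditions is a hypothetical list of the conditions that must all be
// True for the resource to count as Ready.
var requiredConditions = []string{ConditionClusterCIDRReady}

// ErrAPIUnavailable is a hypothetical error used to simulate an EKS API outage.
var ErrAPIUnavailable = errors.New("EKS API unavailable")

// NodeClass is a hypothetical, simplified stand-in for an EC2NodeClass.
type NodeClass struct {
	AMIFamily  string
	Conditions map[string]ConditionStatus
}

// Ready is derived from the required conditions instead of being set directly,
// so it recovers as soon as every sub-condition is True again.
func (n *NodeClass) Ready() bool {
	for _, c := range requiredConditions {
		if n.Conditions[c] != StatusTrue {
			return false
		}
	}
	return true
}

// CIDRResolver is a hypothetical hook standing in for the cluster CIDR lookup.
type CIDRResolver func() (string, error)

// ReconcileClusterCIDR updates only the ClusterCIDRReady condition.
// Assumption: families other than AL2023 do not need the CIDR and are marked True.
func ReconcileClusterCIDR(n *NodeClass, resolve CIDRResolver) error {
	if n.AMIFamily != "AL2023" {
		n.Conditions[ConditionClusterCIDRReady] = StatusTrue
		return nil
	}
	if _, err := resolve(); err != nil {
		n.Conditions[ConditionClusterCIDRReady] = StatusFalse
		return fmt.Errorf("resolving cluster CIDR: %w", err)
	}
	n.Conditions[ConditionClusterCIDRReady] = StatusTrue
	return nil
}
```

In this sketch, calling `ReconcileClusterCIDR` with a resolver that returns `ErrAPIUnavailable` leaves `Ready()` false, and calling it again with a resolver that succeeds makes `Ready()` true, because nothing outside the dedicated condition was overwritten by the failure.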
Tested by:
- Creating an EC2NodeClass with AL2023 AMI family
- Temporarily revoking EKS DescribeCluster permissions to simulate API unavailability
- Verifying the EC2NodeClass transitions to NotReady state
- Restoring permissions
- Verifying the EC2NodeClass correctly transitions back to Ready state during the next reconciliation
Does this change impact docs?
- [x] Yes, PR includes docs updates
- [ ] Yes, issue opened: #
- [ ] No
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
Deploy Preview for karpenter-docs-prod ready!
| Name | Link |
|---|---|
| Latest commit | f91373c943c414218d1ee44dc495146dbfa65950 |
| Latest deploy log | https://app.netlify.com/sites/karpenter-docs-prod/deploys/67ecc399b669720008285f59 |
| Deploy Preview | https://deploy-preview-7965--karpenter-docs-prod.netlify.app |
Should be resolved by #8408