EKS controller stops reconciling after ACK.Terminal status condition
Describe the bug
We created a cluster using the definition below (note the role ARN provided in the cluster definition).
If the role does not exist yet when the cluster is created (a race condition), the cluster status shows an ACK.Terminal condition that is never cleared, even though the role is created successfully 1-2 seconds later.
Both the EKS and IAM controllers are configured to reconcile every 10 to 20 seconds (configuration attached in a later section).
However, if I restart the EKS controller by deleting its pod, it reconciles successfully and removes the ACK.Terminal condition. This workaround is not practical, because we cannot keep restarting the pod for every change in the YAML.
Steps to reproduce
Step 1: Create the cluster first.
Step 2: Create the role.
Cluster definition
apiVersion: eks.services.k8s.aws/v1alpha1
kind: Cluster
metadata:
  annotations:
    services.k8s.aws/deletion-policy: delete
  finalizers:
  - finalizers.eks.services.k8s.aws/Cluster
  name: moon
  namespace: control
spec:
  kubernetesNetworkConfig:
    ipFamily: ipv4
    serviceIPv4CIDR: 172.20.0.0/16
  logging:
    clusterLogging:
    - enabled: true
      types:
      - api
      - audit
      - authenticator
      - controllerManager
      - scheduler
  name: moon
  resourcesVPCConfig:
    endpointPrivateAccess: true
    endpointPublicAccess: true
    publicAccessCIDRs:
    - 123.45.67.89/32
    securityGroupIDs:
    - sg-123
    subnetIDs:
    - subnet-123
    - subnet-456
    - subnet-789
  roleARN: arn:aws:iam::1234567890:role/moon-eks-cluster
  version: "1.25"
status:
  ackResourceMetadata:
    ownerAccountID: "1234567890"
    region: eu-central-1
  conditions:
  - message: |-
      InvalidParameterException: The provided role doesn't have the Amazon EKS Managed Policies associated with it. Please ensure the following policies [arn:aws:iam::aws:policy/AmazonEKSClusterPolicy] are attached
      {
        RespMetadata: {
          StatusCode: 400,
          RequestID: "aacb3dc6-6bdd-4031-a67e-ae6d461f7e4b"
        },
        ClusterName: "moon",
        Message_: "The provided role doesn't have the Amazon EKS Managed Policies associated with it. Please ensure the following policies [arn:aws:iam::aws:policy/AmazonEKSClusterPolicy] are attached"
      }
    status: "True"
    type: ACK.Terminal
  - lastTransitionTime: "2023-07-11T09:39:55Z"
    message: Resource not synced
    reason: resource is in terminal condition
    status: "False"
    type: ACK.ResourceSynced
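The stuck state is visible from the status alone. As a minimal sketch, assuming the Cluster resource has already been loaded into a Python dict (for example from the JSON output of `kubectl get`), the terminal condition can be picked out like this. `terminal_conditions` is a hypothetical helper name for illustration, not part of ACK.

```python
from typing import Any


def terminal_conditions(resource: dict[str, Any]) -> list[dict[str, Any]]:
    """Return every status condition of type ACK.Terminal with status "True"."""
    conditions = resource.get("status", {}).get("conditions", [])
    return [
        c for c in conditions
        if c.get("type") == "ACK.Terminal" and c.get("status") == "True"
    ]


# Abbreviated copy of the Cluster status from this report.
cluster = {
    "status": {
        "conditions": [
            {
                "type": "ACK.Terminal",
                "status": "True",
                "message": "InvalidParameterException: The provided role doesn't "
                           "have the Amazon EKS Managed Policies associated with it.",
            },
            {
                "type": "ACK.ResourceSynced",
                "status": "False",
                "reason": "resource is in terminal condition",
            },
        ]
    }
}

# Print only the error class at the start of each terminal message.
for cond in terminal_conditions(cluster):
    print(cond["message"].split(":", 1)[0])
```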
Role definition
apiVersion: iam.services.k8s.aws/v1alpha1
kind: Role
metadata:
  annotations:
    services.k8s.aws/deletion-policy: delete
  finalizers:
  - finalizers.iam.services.k8s.aws/Role
  name: moon-eks-cluster
  namespace: control
spec:
  assumeRolePolicyDocument: |-
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "EKSClusterAssumeRole",
          "Effect": "Allow",
          "Principal": {
            "Service": "eks.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
        }
      ]
    }
  description: IAM role that is used by an eks cluster.
  inlinePolicies:
    cluster-elb-sl: |-
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Action": [
              "ec2:DescribeInternetGateways",
              "ec2:DescribeAddresses",
              "ec2:DescribeAccountAttributes"
            ],
            "Effect": "Allow",
            "Resource": "*",
            "Sid": ""
          }
        ]
      }
  maxSessionDuration: 3600
  name: moon-eks-cluster
  path: /
  policies:
  - arn:aws:iam::aws:policy/AmazonEKSClusterPolicy
  - arn:aws:iam::aws:policy/AmazonEKSServicePolicy
  - arn:aws:iam::aws:policy/AmazonEKSVPCResourceController
status:
  ackResourceMetadata:
    arn: arn:aws:iam::1234567890:role/moon-eks-cluster
    ownerAccountID: "1234567890"
    region: eu-central-1
  conditions:
  - lastTransitionTime: "2023-07-11T10:11:59Z"
    message: Late initialization successful
    reason: Late initialization successful
    status: "True"
    type: ACK.LateInitialized
  - lastTransitionTime: "2023-07-11T10:11:59Z"
    message: Resource synced successfully
    reason: ""
    status: "True"
    type: ACK.ResourceSynced
  createDate: "2023-07-11T09:39:54Z"
  roleID: XXXXXXXXXXXXXXXXX
  roleLastUsed: {}
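To confirm that the finished role does satisfy the requirement named in the EKS error, the policy ARNs in the error message can be compared with the role's `policies` list. The sketch below is illustrative only: `missing_policies` is a hypothetical helper, and it assumes the error lists the required ARNs comma-separated inside square brackets (this report shows only a single ARN, so the comma handling is a guess).

```python
import re


def missing_policies(error_message: str, attached: list[str]) -> list[str]:
    """List policy ARNs named in the error's [...] block that are not in `attached`."""
    match = re.search(r"\[([^\]]*)\]", error_message)
    if match is None:
        return []
    # Assumption: multiple ARNs, if present, are separated by commas.
    required = [arn.strip() for arn in match.group(1).split(",") if arn.strip()]
    return [arn for arn in required if arn not in attached]


# Error text and policy list copied from the Cluster status and Role spec.
error_message = (
    "The provided role doesn't have the Amazon EKS Managed Policies associated "
    "with it. Please ensure the following policies "
    "[arn:aws:iam::aws:policy/AmazonEKSClusterPolicy] are attached"
)
role_policies = [
    "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy",
    "arn:aws:iam::aws:policy/AmazonEKSServicePolicy",
    "arn:aws:iam::aws:policy/AmazonEKSVPCResourceController",
]

print(missing_policies(error_message, role_policies))
```

An empty result means the role definition already lists every policy the error asks for, which is consistent with the failure being a timing issue rather than a misconfigured role.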
Both the EKS and IAM controllers are configured to reconcile every 10 to 20 seconds.
EKS Helm chart values used when installing the controller:
reconcile:
  resourceResyncPeriods: {
    Nodegroup: 10,
    Cluster: 20,
    Addon: 15
  }
IAM Helm chart values used when installing the controller:
reconcile:
  resourceResyncPeriods: {
    Role: 10
  }
Expected outcome
Because the EKS controller is configured to reconcile every 20 seconds, the cluster should sync automatically in the next reconcile loop after the role becomes available.
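The difference between the observed and the expected behaviour can be illustrated with a toy model. This is not ACK runtime code: `reconcile_once`, `run`, and the `retry_terminal` flag are hypothetical names, and the model only contrasts a loop that stops at the first terminal condition with one that retries on the 20-second resync period.

```python
def reconcile_once(role_ready: bool) -> str:
    """One simulated create attempt: succeeds only if the role is ready."""
    return "Synced" if role_ready else "Terminal"


def run(role_ready_at: int, resync_period: int, horizon: int, retry_terminal: bool) -> str:
    """Simulate reconciles at t = 0, resync_period, ... up to horizon seconds."""
    state = "Pending"
    for t in range(0, horizon + 1, resync_period):
        state = reconcile_once(role_ready=t >= role_ready_at)
        if state == "Synced":
            break
        if state == "Terminal" and not retry_terminal:
            break  # observed behaviour: no further attempts after ACK.Terminal
    return state


# The role becomes usable 2 seconds after the Cluster is first reconciled.
observed = run(role_ready_at=2, resync_period=20, horizon=60, retry_terminal=False)
expected = run(role_ready_at=2, resync_period=20, horizon=60, retry_terminal=True)
print("observed:", observed)
print("expected:", expected)
```

In this model the first run stays in `Terminal`, matching the report, while the second reaches `Synced` at the 20-second resync, which is the outcome requested here.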
Environment: dev
- Kubernetes version: 1.25
- Using EKS (yes/no), if so version? 1.25
- AWS service targeted (S3, RDS, etc.): EKS, IAM