IAM Controller 1.3.8 reconciler reports duplicate IAM policies
Describe the bug
When I create an IAM policy, it is created in AWS (I can see it in the console) and the Policy object is created on the cluster, but the controller reports a duplicate-name error even though no duplicate exists. This also prevents the creation of the role.
Steps to reproduce
I created a Helm chart that creates an IAM Policy and Role. When I deployed the chart, the Policy's status showed:
Status:
  Ack Resource Metadata:
    Owner Account ID:  123456789123
    Region:            us-west-2
  Conditions:
    Message:               EntityAlreadyExists: A policy called foobar already exists. Duplicate names are not allowed.
                           status code: 409, request id: 676eabc9-bd60-4d2f-857d-b243ea8b04e9
    Status:                True
    Type:                  ACK.Recoverable
    Last Transition Time:  2024-06-27T20:09:59Z
    Message:               Unable to determine if desired resource state matches latest observed state
    Reason:                EntityAlreadyExists: A policy called foobar already exists. Duplicate names are not allowed.
                           status code: 409, request id: 676eabc9-bd60-4d2f-857d-b243ea8b04e9
    Status:                Unknown
    Type:                  ACK.ResourceSynced
Events:                    <none>
I checked the AWS console and the IAM policy had been created, but the reconciler kept trying to create it again, which is why it reports a duplicate.
I checked previously created IAM policies and they are now getting this message as well.
So in ArgoCD I disabled auto sync, removed the policy from the IAM console, and removed the Policy object from the cluster. I synced again and got the same results.
I rolled back to version 1.3.4 and my IAM policy and role were created successfully.
Expected outcome
The IAM policy should be created once, and the reconciler should stop trying to create it again, since that repeated create is what triggers the duplicate error.
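To make the expected behavior concrete, here is a minimal sketch of an idempotent create using aws-sdk-go-v2. This is not the controller's actual code; the findOrCreatePolicy helper, its signature, and the ARN construction are my own assumptions.

package sketch

import (
	"context"
	"errors"
	"fmt"

	"github.com/aws/aws-sdk-go-v2/service/iam"
	"github.com/aws/aws-sdk-go-v2/service/iam/types"
)

// findOrCreatePolicy looks the policy up by its ARN first and only calls
// CreatePolicy when it does not exist yet, instead of blindly re-creating it
// and hitting EntityAlreadyExists.
func findOrCreatePolicy(ctx context.Context, client *iam.Client, accountID, name, document string) (*types.Policy, error) {
	arn := fmt.Sprintf("arn:aws:iam::%s:policy/%s", accountID, name)

	out, err := client.GetPolicy(ctx, &iam.GetPolicyInput{PolicyArn: &arn})
	if err == nil {
		// The policy from a previous reconcile already exists; adopt it.
		return out.Policy, nil
	}
	var nfe *types.NoSuchEntityException
	if !errors.As(err, &nfe) {
		return nil, err // a real API error, not "policy missing"
	}

	created, err := client.CreatePolicy(ctx, &iam.CreatePolicyInput{
		PolicyName:     &name,
		PolicyDocument: &document,
	})
	if err != nil {
		return nil, err
	}
	return created.Policy, nil
}

In other words, once the first reconcile has created the policy, later reconciles should find it and treat it as synced rather than calling CreatePolicy again.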
Environment
- Kubernetes version? v1.27.13-eks-3af4770
- Using EKS (yes/no), if so version? Yes, 1.28
- AWS service targeted (IAM)
Thanks for reporting this @mikeroth - can you please share an example yaml file I could use to reproduce this issue locally?
Hi @a-hilaly,
I have this as a template in a Helm chart deployed by ArgoCD, but it looks like the manifest below if I were applying it directly.
apiVersion: iam.services.k8s.aws/v1alpha1
kind: Policy
metadata:
  name: name-policy
spec:
  name: name-policy
  description: Policy Description
  policyDocument: |
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "elasticloadbalancing:DeregisterTargets",
            "elasticloadbalancing:RegisterTargets",
            "elasticloadbalancing:DescribeTargetGroups"
          ],
          "Resource": [
            "arn:aws:elasticloadbalancing:us-west-2:123456789101:targetgroup/pref-20240628220400505100000001/*",
            "arn:aws:elasticloadbalancing:us-west-2:123456789101:loadbalancer/net/moniker/*"
          ]
        }
      ]
    }
  tags:
    - key: moniker
      value: moniker
@a-hilaly we are seeing the same issue in 1.3.4 as well when running the IAM controller with 3 replicas. The only way to recover is to delete the AWS policy and then let it sync; a scripted version of that cleanup is sketched below. We also use ArgoCD to manage these policies, could that be an issue?
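For anyone who wants to script that recovery step instead of deleting the policy in the console, here is a rough sketch using aws-sdk-go-v2; the ARN is just the example values from this thread, and it assumes the policy has no attachments or non-default versions.

package main

import (
	"context"
	"log"

	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/iam"
)

func main() {
	ctx := context.Background()

	// Load credentials and region the same way the AWS CLI does.
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}
	client := iam.NewFromConfig(cfg)

	// Example ARN built from the account ID and policy name in this thread.
	arn := "arn:aws:iam::123456789123:policy/foobar"
	if _, err := client.DeletePolicy(ctx, &iam.DeletePolicyInput{PolicyArn: &arn}); err != nil {
		log.Fatal(err)
	}
	log.Printf("deleted %s; the next sync can recreate it", arn)
}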
Issues go stale after 180d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 60d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/aws-controllers-k8s/community.
/lifecycle stale
Stale issues rot after 60d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 60d of inactivity.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/aws-controllers-k8s/community.
/lifecycle rotten
/close
@rushmash91: Closing this issue.
In response to this:
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.