
IAM Controller 1.3.8 reconciler reports duplicate IAM policies

Open mikeroth opened this issue 1 year ago • 3 comments

Describe the bug

When I create an IAM policy, it is created in AWS (visible in the console) and the Policy object is created on the cluster, but the controller reports a duplicate name even though no duplicate exists. This also prevents the role from being created.

Steps to reproduce

I created a Helm chart that creates an IAM Policy and Role. When I deployed the chart, the status showed:

Status:
  Ack Resource Metadata:
    Owner Account ID:  123456789123
    Region:            us-west-2
  Conditions:
    Message:               EntityAlreadyExists: A policy called foobar already exists. Duplicate names are not allowed.
                           status code: 409, request id: 676eabc9-bd60-4d2f-857d-b243ea8b04e9
    Status:                True
    Type:                  ACK.Recoverable
    Last Transition Time:  2024-06-27T20:09:59Z
    Message:               Unable to determine if desired resource state matches latest observed state
    Reason:                EntityAlreadyExists: A policy called foobar already exists. Duplicate names are not allowed.
                           status code: 409, request id: 676eabc9-bd60-4d2f-857d-b243ea8b04e9
    Status:                Unknown
    Type:                  ACK.ResourceSynced
Events:                    <none>

I checked the AWS console and the IAM policy had been created, but the reconciler was trying to create it again, which is why it reports a duplicate.
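The symptom is consistent with a reconcile pass that issues a create without first finding the existing policy. The sketch below illustrates that failure mode with a hypothetical in-memory stand-in (`FakeIAM`); it is not the ACK controller's actual code, and every name in it is an assumption made for illustration.

```python
# Illustrative only: FakeIAM and both reconcile functions are hypothetical
# and do not reflect the ACK IAM controller's implementation.


class EntityAlreadyExists(Exception):
    """Mimics the IAM error raised when a policy name is already taken."""


class FakeIAM:
    """Minimal in-memory stand-in for the IAM policy API."""

    def __init__(self):
        self.policies = {}  # policy name -> policy document

    def create_policy(self, name, document):
        if name in self.policies:
            raise EntityAlreadyExists(
                f"A policy called {name} already exists. "
                "Duplicate names are not allowed."
            )
        self.policies[name] = document

    def get_policy(self, name):
        # Returns None when the policy does not exist.
        return self.policies.get(name)


def reconcile_create_only(iam, name, document):
    """Always attempts a create, so a second pass raises a duplicate error."""
    iam.create_policy(name, document)
    return "created"


def reconcile_read_first(iam, name, document):
    """Reads before creating, so repeated passes are idempotent."""
    if iam.get_policy(name) is not None:
        return "exists"
    iam.create_policy(name, document)
    return "created"
```

With the first function, the second reconcile of the same name raises `EntityAlreadyExists`, matching the condition message above; with the second, repeated reconciles return `"exists"` instead.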

I checked previously created IAM policies and they were now showing this message as well.

So in ArgoCD I disabled auto sync, removed the policy from the IAM console, and removed the policy object from the cluster. I synced again and got the same result.
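When verifying by hand that the policy is really gone from (or present in) the account, it helps to know the ARN to look up. A small sketch, assuming the standard `aws` partition and the default policy path `/`; `expected_policy_arn` is a hypothetical helper, not part of any AWS or ACK library.

```python
def expected_policy_arn(account_id, name, path="/", partition="aws"):
    """Build the ARN of a customer managed IAM policy.

    Assumes `path` begins and ends with "/", as IAM paths do.
    """
    if not (path.startswith("/") and path.endswith("/")):
        raise ValueError("IAM paths must begin and end with '/'")
    return f"arn:{partition}:iam::{account_id}:policy{path}{name}"


# Account ID and policy name taken from the status output above.
arn = expected_policy_arn("123456789123", "foobar")
print(arn)
```

The resulting ARN can be passed to `aws iam get-policy --policy-arn <arn>` to confirm whether the policy exists before and after a sync.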

I rolled back to version 1.3.4 and my IAM policy and role were created successfully.

Expected outcome

The IAM policy should be created, and the reconciler should then stop trying to create it again, since the repeated create is what triggers the duplicate error.

Environment

  • Kubernetes version? v1.27.13-eks-3af4770
  • Using EKS (yes/no), if so version? Yes, 1.28
  • AWS service targeted (IAM)

mikeroth, Jun 27 '24 20:06

Thanks for reporting this, @mikeroth. Can you please share an example YAML file I could use to reproduce this issue locally?

a-hilaly, Jun 27 '24 21:06

Hi @a-hilaly,

I have this as a template in a Helm chart deployed by ArgoCD, but this is what it would look like if I applied it directly:

apiVersion: iam.services.k8s.aws/v1alpha1
kind: Policy
metadata:
  name: name-policy
spec:
  name: name-policy
  description: Policy Description
  policyDocument: |
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "elasticloadbalancing:DeregisterTargets",
            "elasticloadbalancing:RegisterTargets",
            "elasticloadbalancing:DescribeTargetGroups"
          ],
          "Resource": [
            "arn:aws:elasticloadbalancing:us-west-2:123456789101:targetgroup/pref-20240628220400505100000001/*",
            "arn:aws:elasticloadbalancing:us-west-2:123456789101:loadbalancer/net/moniker/*"
          ]
        }
      ]
    }
  tags:
  - key: moniker
    value: moniker

mikeroth, Jun 28 '24 16:06

@a-hilaly we are seeing the same issue in 1.3.4 as well when running the IAM controller with 3 replicas. The only way to recover is to delete the AWS policy and then let it sync. We use ArgoCD to manage these policies; could that be an issue?

alagukannan, Sep 19 '24 23:09

Issues go stale after 180d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 60d of inactivity and eventually close. If this issue is safe to close now please do so with /close. Provide feedback via https://github.com/aws-controllers-k8s/community. /lifecycle stale

ack-bot, Mar 19 '25 01:03

Stale issues rot after 60d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 60d of inactivity. If this issue is safe to close now please do so with /close. Provide feedback via https://github.com/aws-controllers-k8s/community. /lifecycle rotten

ack-bot, May 18 '25 02:05

/close

rushmash91, Jun 11 '25 02:06

@rushmash91: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

ack-prow[bot], Jun 11 '25 02:06