Recent change in eksctl might break source-controller discovery of OCI HelmRepositories on AWS ECR
Not really a bug, but something to be aware of (or to make users aware of) when running on AWS EKS and using AWS ECR:
Until now, eksctl attached a full ReadOnly AWS policy to all nodes. source-controller inherited that policy, which grants, among others, the ecr:ListImages permission.
Recently, eksctl switched to a narrower "PullOnly" policy that lacks this ListImages permission: https://github.com/eksctl-io/eksctl/pull/8386
As a result, source-controller can no longer discover versions of Helm charts in AWS ECR OCI HelmRepositories (it just logs a 403 Permission Denied).
That change is a good thing, but users now have to give that permission "back" to flux-system/source-controller by making it an IAM ServiceAccount.
Example for an eksctl ClusterConfig (the policy is the same as the new PullOnly policy, with ecr:ListImages added):
...
  - metadata:
      name: "source-controller"
      namespace: "flux-system"
    attachPolicy: {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "ecr:GetAuthorizationToken",
            "ecr:BatchGetImage",
            "ecr:ListImages",
            "ecr:GetDownloadUrlForLayer",
            "ecr:BatchImportUpstreamImage"
          ],
          "Resource": "*"
        }
      ]
      }
...
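For anyone who wants to sanity-check a policy document before attaching it, here is a minimal Python sketch that reports which required actions no Allow statement covers. The helper name `missing_actions` and the sample policies are made up for illustration. It only looks at `Effect` and `Action` (including `*` wildcards) and deliberately ignores `Deny` statements, `NotAction`, `Resource` and `Condition`, so it is a rough check, not a replacement for the IAM policy simulator.

```python
import fnmatch


def missing_actions(policy, required):
    """Return the actions from `required` that no Allow statement in `policy` covers.

    Simplified on purpose: Deny, NotAction, Resource and Condition are ignored.
    """
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement object is valid IAM JSON
        statements = [statements]

    allowed = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):  # a single action string is valid IAM JSON
            actions = [actions]
        # IAM action names are case-insensitive, so compare in lower case.
        allowed.extend(a.lower() for a in actions)

    return [
        action
        for action in required
        if not any(fnmatch.fnmatchcase(action.lower(), pattern) for pattern in allowed)
    ]


# Hypothetical sample: the example policy from this issue without ecr:ListImages.
pull_only = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ecr:GetAuthorizationToken",
                "ecr:BatchGetImage",
                "ecr:GetDownloadUrlForLayer",
                "ecr:BatchImportUpstreamImage",
            ],
            "Resource": "*",
        }
    ],
}

# The same policy with ecr:ListImages added, as in the ClusterConfig example.
with_list_images = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": pull_only["Statement"][0]["Action"] + ["ecr:ListImages"],
            "Resource": "*",
        }
    ],
}

print(missing_actions(pull_only, ["ecr:ListImages"]))         # ['ecr:ListImages']
print(missing_actions(with_list_images, ["ecr:ListImages"]))  # []
```

The policy JSON could be pasted in by hand or loaded with `json.load` from a file exported from the AWS console.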
With that, it seems to work again for me.
https://fluxcd.io/flux/integrations/aws/
https://fluxcd.io/flux/integrations/aws/#for-amazon-elastic-container-registry
@matheuscscp yes, until now, this policy was implicitly available on the NodeInstanceRole. Now one has to give it explicitly to source-controller (one way or another).
In Flux it doesn't even have to be granted to source-controller; it can be granted individually to CRs via ServiceAccounts. This is a feature we released in Flux 2.6. Please read the full docs above.
Thanks for posting this, @MartinEmrich. This is indeed a breaking change that could affect many Flux users.
Let's see if our integration test for node-level access is affected:
https://github.com/fluxcd/pkg/actions/runs/18170857558
If it isn't, we should probably review the permissions we are granting to the node role, possibly reduce them, and adapt the test somehow.
@matheuscscp we don't use eksctl in e2e, so it will not fail unless HashiCorp has copied the new policy from eksctl into the latest provider (which they will do at some point).
I think all Flux users who have single-tenant clusters rely on the node-level permission. If ecr:ListImages is no longer granted to the nodes, then OCIRepository, HelmChart and HelmRepository (type: oci) will all fail.
Yes, the node instance role is assigned by the user on each managed node group. There is no general AWS default, only a recommendation (which is that new ...PullOnly policy): https://docs.aws.amazon.com/eks/latest/userguide/create-node-role.html
eksctl (the AWS-endorsed CLI for managing AWS EKS clusters) now simply follows this recommendation unless instructed otherwise.
If you provision your EKS cluster by other means (Terraform, Tofu, AWS CLI, API), this does not affect you.