
Please support irsa within the crossplane container for fetching AWS ECR hosted OCI images

Open jwitko opened this issue 1 year ago • 37 comments

What problem are you facing?

I used xpkg to build a package OCI image and push it to AWS ECR. When I tried to use it, Crossplane could not pull it from ECR, even though the crossplane pod's service account has all the required permissions via IRSA. The issue seems to be that Crossplane only supports imagePullSecrets-style secrets and does not natively perform the docker login. This is a significant hurdle because an imagePullSecret is not refreshed automatically when its token expires.

How could Crossplane help solve your problem?

Crossplane could support ECR login when auth is provided via IRSA instead of only relying on docker-registry secret types.

jwitko avatar Nov 26 '24 04:11 jwitko

This is supported and working. Can you share the setup of your IRSA role? If you are using VPC endpoints, make sure the crossplane pod can reach the VPC endpoints for ECR.

Ref: https://github.com/crossplane/crossplane/blob/main/internal/xpkg/fetch.go#L131 https://github.com/google/go-containerregistry/blob/main/pkg%2Fauthn%2Fk8schain%2FREADME.md

the keychain also includes cloud-specific credential helpers for Google Container Registry (and Artifact Registry), Azure Container Registry, and Amazon AWS Elastic Container Registry. This means that if the keychain is used from within Kubernetes services on those clouds (GKE, AKS, EKS), any available service credentials will be discovered and used

haarchri avatar Nov 27 '24 22:11 haarchri

Crossplane fetches the package via IRSA and then sets up the CRDs and the Deployment. What is configured for the Deployment depends on the DeploymentRuntimeConfig.

haarchri avatar Nov 27 '24 22:11 haarchri


Crossplane does use IRSA, but that is not enough to pull from a private ECR registry. You must also log in to the ECR registry to fetch a token.

jwitko avatar Nov 28 '24 04:11 jwitko

We are using IRSA to pull packages from a private ECR by simply annotating the serviceAccount with a role and policy.

Can you share the error you see? What does kubectl get pkgrev show?

haarchri avatar Nov 28 '24 06:11 haarchri

We have some misleading logging (discarded log output) around this, which reminds me of: https://github.com/crossplane/crossplane/issues/5805

haarchri avatar Nov 28 '24 06:11 haarchri

I'm installing Crossplane via the official Helm chart. I provide the following relevant values for configuring IRSA:

  extraObjects:
    - apiVersion: pkg.crossplane.io/v1beta1
      kind: DeploymentRuntimeConfig
      metadata:
        name: irsa-runtimeconfig
      spec:
        serviceAccountTemplate:
          metadata:
            annotations:
              eks.amazonaws.com/role-arn: arn:aws:iam::redacted:role/redacted

    - apiVersion: aws.upbound.io/v1beta1
      kind: ProviderConfig
      metadata:
        name: default
      spec:
        credentials:
          source: IRSA

    - apiVersion: pkg.crossplane.io/v1
      kind: Provider
      metadata:
        name: provider-aws-s3
      spec:
        package: xpkg.upbound.io/upbound/provider-aws-s3:v1.17.0
        runtimeConfigRef:
          name: irsa-runtimeconfig

All of this worked fine until I tried to install an internally managed Configuration from a private AWS ECR registry. I was met with errors about IAM privileges. At first it wasn't clear which component was calling ECR, but eventually I figured out it was the crossplane pod itself. I attached an IRSA role to the service account used by crossplane:

apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::redacted:role/redacted-crossplane
  name: crossplane

Once I did this, all IAM errors went away when downloading Configurations from ECR, but I was met with a new error: 401 Unauthorized. This is because simply having IAM permissions to use ECR is not enough. You also have to create a token. On a CLI this commonly looks like:

aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 123456789.dkr.ecr.us-east-1.amazonaws.com

Because the Crossplane code doesn't seem to handle this scenario, I had to resort to a CronJob that uses the IAM role to create a K8s secret with short-lived credentials on a schedule, and then tell the crossplane service account to use that secret as its imagePullSecret.
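
For anyone wondering what such a refresh job has to produce: the value returned by ecr:GetAuthorizationToken is a base64 string that decodes to AWS:<password>, and a docker-registry secret wraps those credentials in a .dockerconfigjson document. Below is a minimal Python sketch of that conversion. The function name and the sample password are made up for illustration, and fetching the token from AWS and writing the secret are left out.

```python
import base64
import json


def dockerconfig_from_ecr_token(registry: str, authorization_token: str) -> str:
    """Build the .dockerconfigjson payload for a docker-registry secret.

    authorization_token is the base64 string returned by ECR's
    GetAuthorizationToken call; it decodes to "AWS:<password>".
    """
    decoded = base64.b64decode(authorization_token).decode("utf-8")
    username, _, password = decoded.partition(":")
    if username != "AWS" or not password:
        raise ValueError("unexpected ECR authorization token format")
    config = {
        "auths": {
            registry: {
                "username": username,
                "password": password,
                # Docker clients read "auth" as base64("user:password").
                "auth": base64.b64encode(decoded.encode("utf-8")).decode("ascii"),
            }
        }
    }
    return json.dumps(config)


if __name__ == "__main__":
    # Hypothetical token for illustration only; a real one comes from AWS.
    fake_token = base64.b64encode(b"AWS:example-password").decode("ascii")
    print(dockerconfig_from_ecr_token(
        "123456789.dkr.ecr.us-east-1.amazonaws.com", fake_token))
```

ECR authorization tokens are valid for 12 hours, which is why the secret has to be rewritten on a schedule rather than created once.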

jwitko avatar Dec 02 '24 18:12 jwitko

this is exactly handled by this code path:

This is supported and working. Can you share the setup of your IRSA role? If you are using VPC endpoints, make sure the crossplane pod can reach the VPC endpoints for ECR.

Ref: https://github.com/crossplane/crossplane/blob/main/internal/xpkg/fetch.go#L131 https://github.com/google/go-containerregistry/blob/main/pkg%2Fauthn%2Fk8schain%2FREADME.md

the keychain also includes cloud-specific credential helpers for Google Container Registry (and Artifact Registry), Azure Container Registry, and Amazon AWS Elastic Container Registry. This means that if the keychain is used from within Kubernetes services on those clouds (GKE, AKS, EKS), any available service credentials will be discovered and used

haarchri avatar Dec 02 '24 18:12 haarchri


Thank you for linking that, but the issue remains. The same IRSA role is able to perform an ECR login and generate credentials from the CronJob, but fails with a 401 Unauthorized when supplied to crossplane via IRSA.

jwitko avatar Dec 02 '24 19:12 jwitko

We are seeing the same thing as @jwitko here, so glad we aren't the only ones.

dhumphries-sainsburys avatar Dec 19 '24 10:12 dhumphries-sainsburys

I shared an image with @jwitko that removes the io.Discard so that we can get some more insight.

Do the node roles have an ECR policy? The nodes need to pull the pod image too.

If you are using VPC endpoints for ECR, is the crossplane pod allowed to reach them?

haarchri avatar Dec 19 '24 10:12 haarchri

The nodes have a valid ECR policy, since this is the only thing with an issue, and I suppose the same could be said for any endpoints (although it looks like we aren't using any related to this). Looking at the linked issue, I haven't built a custom image, but it looks to be the same thing as this and my own issue.

dhumphries-sainsburys avatar Dec 19 '24 11:12 dhumphries-sainsburys

Crossplane does not currently have enough maintainers to address every issue and pull request. This issue has been automatically marked as stale because it has had no activity in the last 90 days. It will be closed in 14 days if no further activity occurs. Leaving a comment starting with /fresh will mark this issue as not stale.

github-actions[bot] avatar Mar 20 '25 01:03 github-actions[bot]

@haarchri - Can we get this reopened? It is still an issue even on the latest release. We worked around it at the time, but we have found a new requirement that is harder to sidestep, so we really need to be able to pull from ECR using role assumption.

Our config is identical to what @jwitko shared (except that it is a package we are trying to pull from ECR). The IAM permissions are a blanket allow to rule out anything like that.

dhumphries-sainsburys avatar May 13 '25 09:05 dhumphries-sainsburys

@haarchri - Considering we know which config doesn't seem to work but you say it should, can we look at this the other way around? Could you provide a config that is verified as working so we can see what we might be missing?

dhumphries-sainsburys avatar May 14 '25 07:05 dhumphries-sainsburys

This works with IRSA and PodIdentity. You need an IAM role for your crossplane service account, with a policy something like:

          {
            "Version": "2012-10-17",
            "Statement": [
              {
                "Effect": "Allow",
                "Action": [
                  "ecr:GetAuthorizationToken"
                ],
                "Resource": "*"
              },
              {
                "Effect": "Allow",
                "Action": [
                  "ecr:BatchCheckLayerAvailability",
                  "ecr:GetDownloadUrlForLayer",
                  "ecr:BatchGetImage"
                ],
                "Resource": "arn:aws:ecr:*:*:repository/*"
              }
            ]
          }
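
To compare a role's policy against the four pull actions listed above, a small check like the following Python sketch can help. It is deliberately simplified and is an assumption-laden illustration, not how IAM evaluates policies: it only looks at Allow statements, approximates IAM wildcards with fnmatch, and ignores Resource, Condition, and explicit Deny. The function and constant names are made up.

```python
import fnmatch

# The four actions from the policy above that a registry pull relies on.
REQUIRED_PULL_ACTIONS = (
    "ecr:GetAuthorizationToken",
    "ecr:BatchCheckLayerAvailability",
    "ecr:GetDownloadUrlForLayer",
    "ecr:BatchGetImage",
)


def missing_pull_actions(policy: dict) -> list:
    """Return the required ECR pull actions no Allow statement covers.

    Simplification: Resource, Condition and Deny statements are ignored.
    """
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be an object
        statements = [statements]
    granted = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # "Action" may be a string or a list
            actions = [actions]
        # IAM action names are case-insensitive, so compare in lowercase.
        granted.extend(action.lower() for action in actions)
    return [
        required
        for required in REQUIRED_PULL_ACTIONS
        if not any(fnmatch.fnmatchcase(required.lower(), g) for g in granted)
    ]


if __name__ == "__main__":
    example = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["ecr:BatchGetImage"], "Resource": "*"}
        ],
    }
    print(missing_pull_actions(example))
```

An empty list only means the action names are present; whether the Resource scope matches the repository being pulled still has to be checked separately.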

Amazon EKS workloads hosted on managed or self-managed nodes: The Amazon EKS worker node IAM role (NodeInstanceRole) is required. The Amazon EKS worker node IAM role must contain the following IAM policy permissions for Amazon ECR: https://docs.aws.amazon.com/AmazonECR/latest/userguide/ECR_on_EKS.html

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ecr:BatchCheckLayerAvailability",
                "ecr:BatchGetImage",
                "ecr:GetDownloadUrlForLayer",
                "ecr:GetAuthorizationToken"
            ],
            "Resource": "*"
        }
    ]
}

Then mirror an xpkg to ECR, something like this (remember that you need to create the ECR repository first):

crane copy xpkg.upbound.io/upbound/configuration-aws-eks:v0.18.2 123456789012.dkr.ecr.us-west-2.amazonaws.com/upbound/configuration-aws-eks:v0.18.2

create the configuration - something like this:

apiVersion: pkg.crossplane.io/v1
kind: Configuration
metadata:
  name: upbound-configuration-aws-eks
spec:
  package: 123456789012.dkr.ecr.us-west-2.amazonaws.com/upbound/configuration-aws-eks:v0.18.2

check the result:

kubectl get pkgrev                                                                                 
NAME                                                                                     HEALTHY   REVISION   IMAGE                                                                                STATE    DEP-FOUND   DEP-INSTALLED   AGE
configurationrevision.pkg.crossplane.io/upbound-configuration-aws-eks-4931c331e716       True      1          123456789012.dkr.ecr.us-west-2.amazonaws.com/upbound/configuration-aws-eks:v0.18.2   Active   10          10              4m16s
[...]

So nothing special: the Crossplane pod pulls the xpkg from ECR and then creates the Deployment, and the EKS nodes pull the pod image from ECR.

haarchri avatar Jun 09 '25 09:06 haarchri

Here is a full run with PodIdentity:

Create a Kind Cluster:

kind create cluster --name chris-cluster      

Install Crossplane:

helm install crossplane \
--namespace crossplane-system \
--create-namespace crossplane-stable/crossplane

Create Configurations:

apiVersion: pkg.crossplane.io/v1
kind: Configuration
metadata:
  name: upbound-configuration-aws-eks
spec:
  package: xpkg.upbound.io/upbound/configuration-aws-eks:v0.18.2
---
apiVersion: pkg.crossplane.io/v1
kind: Configuration
metadata:
  name: upbound-configuration-aws-eks-pod-identity
spec:
  package: xpkg.upbound.io/upbound/configuration-aws-eks-pod-identity:v0.7.0

Create ProviderConfig:

  • adjust for your needs:
apiVersion: aws.upbound.io/v1beta1
kind: ProviderConfig
metadata:
  name: default
spec:
  credentials:
    source: Upbound
    upbound:
      webIdentity:
        roleARN: "arn:aws:iam::123456789012:role/chris-cluster"

Create Network & Cluster:

apiVersion: aws.platform.upbound.io/v1alpha1
kind: XNetwork
metadata:
  name: haarchri-ecr-pull
spec:
  parameters:
    id: haarchri-ecr-pull
    region: us-west-2
---
apiVersion: aws.platform.upbound.io/v1alpha1
kind: XEKS
metadata:
  name: haarchri-ecr-pull
spec:
  parameters:
    id: haarchri-ecr-pull
    region: us-west-2
    version: "1.27"
    accessConfig:
      authenticationMode: API_AND_CONFIG_MAP
      bootstrapClusterCreatorAdminPermissions: true
    nodes:
      count: 1
      instanceType: t3.small
    iam:
      principalArn: arn:aws:iam::123456789012:role/aws-reserved/sso.amazonaws.com/AWSReservedSSO_AdministratorAccess_aaaaaaaaaaaa
  writeConnectionSecretToRef:
    name: haarchri-ecr-pull-kubeconfig
    namespace: upbound-system

Create PodIdentity:

apiVersion: aws.platform.upbound.io/v1alpha1
kind: XPodIdentity
metadata:
  name: haarchri-ecr-pull
spec:
  parameters:
    region: us-west-2
    clusterNameSelector:
      matchLabels:
        crossplane.io/composite: haarchri-ecr-pull
    inlinePolicy:
      - name: default
        policy: |
          {
            "Version": "2012-10-17",
            "Statement": [
              {
                "Effect": "Allow",
                "Action": [
                  "ecr:GetAuthorizationToken"
                ],
                "Resource": "*"
              },
              {
                "Effect": "Allow",
                "Action": [
                  "ecr:BatchCheckLayerAvailability",
                  "ecr:GetDownloadUrlForLayer",
                  "ecr:BatchGetImage"
                ],
                "Resource": "arn:aws:ecr:*:*:repository/*"
              }
            ]
          }
    serviceAccount:
      name: crossplane
      namespace: crossplane-system

mirror to ECR:

aws ecr get-login-password --region us-west-2 --profile AdministratorAccess-123456789012 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-west-2.amazonaws.com
crane copy xpkg.upbound.io/upbound/configuration-aws-eks:v0.18.2 123456789012.dkr.ecr.us-west-2.amazonaws.com/upbound/configuration-aws-eks:v0.18.2

When all XRs and MRs are ready, connect to the EKS cluster, something like:

aws sso login --profile login
aws eks update-kubeconfig --region us-west-2  --name haarchri-ecr-pull-xrvxv --profile AdministratorAccess-123456789012

install crossplane:

helm install crossplane \
--namespace crossplane-system \
--create-namespace crossplane-stable/crossplane

create configuration with ecr package location:

apiVersion: pkg.crossplane.io/v1
kind: Configuration
metadata:
  name: upbound-configuration-aws-eks
spec:
  package: 123456789012.dkr.ecr.us-west-2.amazonaws.com/upbound/configuration-aws-eks:v0.18.2

check the results:

kubectl get pkgrev                                                                                 
NAME                                                                                     HEALTHY   REVISION   IMAGE                                                                                STATE    DEP-FOUND   DEP-INSTALLED   AGE
configurationrevision.pkg.crossplane.io/upbound-configuration-aws-eks-4931c331e716       True      1          123456789012.dkr.ecr.us-west-2.amazonaws.com/upbound/configuration-aws-eks:v0.18.2   Active   10          10              4m16s
[...]

haarchri avatar Jun 09 '25 09:06 haarchri

@dhumphries-sainsburys can you share a bit more of your setup? Are you using VPC endpoints for STS or ECR? If so, are the cluster security groups allowed to use the VPC endpoints?

haarchri avatar Jun 09 '25 09:06 haarchri

For debugging I removed the io.Discard (https://github.com/crossplane/crossplane/blob/release-1.20/internal/xpkg/fetch.go#L37-L44) so that we get a bit more output in your environments. Please run Crossplane with this debug image, which is based on release-1.20:

docker.io/haarchri/crossplane:release-1.20

haarchri avatar Jun 09 '25 09:06 haarchri

Attempted the steps in https://github.com/crossplane/crossplane/issues/6137#issuecomment-2955208289

We have the role (we are pulling from ghcr rather than xpkg):

        {
            "Action": [
                "ec2:UpdateSecurityGroup*",
                "ec2:RevokeSecurityGroup*",
                "ec2:ModifySecurityGroup*",
                "ec2:DescribeSecurityGroup*",
                "ec2:DeleteTags",
                "ec2:DeleteSecurityGroup",
                "ec2:CreateTags",
                "ec2:CreateSecurityGroup",
                "ec2:AuthorizeSecurityGroup*"
            ],
            "Effect": "Allow",
            "Resource": "*"
        },
        {
            "Action": [
                "ecr:UntagResource",
                "ecr:TagResource",
                "ecr:PutImageTagMutability",
                "ecr:PutImageScanningConfiguration",
                "ecr:ListTagsForResource",
                "ecr:GetAuthorizationToken",
                "ecr:DescribeRepositories",
                "ecr:CreateRepository",
                "ecr:*RepositoryPolicy",
                "ecr:*LifecyclePolicy"
            ],
            "Effect": "Allow",
            "Resource": "*"
        },
        {
            "Action": "ecr:*",
            "Effect": "Allow",
            "Resource": "arn:aws:ecr:eu-west-1:<account>:repository/bosun-image-cache/ghcr/*"
        },

The node role uses AmazonEC2ContainerRegistryReadOnly, which includes all the permissions you listed.

Made sure the serviceaccount is annotated with the IRSA role

We have a pull-through cache configured in ECR that is verified as working by taking Crossplane out of the equation.

I then created a Configuration as you showed (we only have Providers currently), and all of them return:

cannot unpack package: failed to fetch package digest from remote: failed to fetch package descriptor with a GET request after a previous HEAD request failure: HEAD https://<account>.dkr.ecr.eu-west-1.amazonaws.com/v2/bosun-image-cache/ghcr/crossplane-contrib/provider-aws-ecr/manifests/v1.21.0: unexpected status code 401 Unauthorized (HEAD responses have no body, use GET for details): GET https://<account>.dkr.ecr.eu-west-1.amazonaws.com/v2/bosun-image-cache/ghcr/crossplane-contrib/provider-aws-ecr/manifests/v1.21.0: unexpected status code 401 Unauthorized: Not Authorized
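
Since the role above scopes ecr:* to a repository prefix, one cheap thing to rule out is that the repository named in the failing URL falls under that Resource pattern. The Python sketch below derives the repository ARN from an image reference and compares it to a pattern. It is only an illustration: the helper names are made up, the regex assumes the standard <account>.dkr.ecr.<region>.amazonaws.com host form, fnmatch is an approximation of IAM wildcard matching, and the 12-digit account ID is a placeholder.

```python
import fnmatch
import re

# Assumes the standard private ECR host form; other partitions are not handled.
_ECR_REF = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repo>[^:@]+)(?::(?P<tag>[^@]+))?(?:@(?P<digest>sha256:[0-9a-f]{64}))?$"
)


def ecr_repository_arn(image_ref: str) -> str:
    """Turn an ECR image reference into the ARN of its repository."""
    match = _ECR_REF.match(image_ref)
    if match is None:
        raise ValueError(f"not a private ECR image reference: {image_ref}")
    return "arn:aws:ecr:{region}:{account}:repository/{repo}".format(
        **match.groupdict()
    )


def resource_covers(resource_pattern: str, arn: str) -> bool:
    """Approximate IAM's wildcard match of a Resource pattern against an ARN."""
    return fnmatch.fnmatchcase(arn, resource_pattern)


if __name__ == "__main__":
    # Placeholder account ID; the repository path mirrors the error above.
    ref = ("123456789012.dkr.ecr.eu-west-1.amazonaws.com/"
           "bosun-image-cache/ghcr/crossplane-contrib/provider-aws-ecr:v1.21.0")
    arn = ecr_repository_arn(ref)
    pattern = ("arn:aws:ecr:eu-west-1:123456789012:"
               "repository/bosun-image-cache/ghcr/*")
    print(arn, resource_covers(pattern, arn))
```

A match here only rules out a scoping mismatch in the policy; it says nothing about whether a token was requested in the first place.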

No endpoint in play for either service

It still seems that Crossplane simply doesn't try to authenticate. This is reinforced by IAM reporting that the role has never even tried to call GetAuthorizationToken, which I assume is how it would authenticate.

Running the debug image makes no difference; the logs are identical.

Installed using Helm like so:

apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: helm-crossplane
  namespace: crossplane-system
spec:
  interval: 5m
  chart:
    spec:
      chart: crossplane
      version: "1.20.0"
      sourceRef:
        kind: HelmRepository
        name: crossplane-stable
      interval: 1m
  values:
    args:
      - "--debug"
    serviceAccount:
      customAnnotations:
        "eks.amazonaws.com/role-arn": arn:aws:iam::${account_number}:role/crossplane
    metrics:
      enabled: true
    image:
      repository: docker.io/haarchri/crossplane
      tag: release-1.20
    resourcesCrossplane:
      requests:
        cpu: "0.1"
        memory: "256Mi"
      limits:
        cpu: "1.0"
        memory: "512Mi"
    resourcesRBACManager:
      requests:
        cpu: "0.1"
        memory: "256Mi"
      limits:
        cpu: "0.5"
        memory: "512Mi"

Logs are a loop of

2025-06-11T11:03:57Z	INFO	crossplane	Beta feature enabled	{"flag": "EnableBetaCompositionWebhookSchemaValidation"}
2025-06-11T11:03:57Z	INFO	crossplane	Beta feature enabled	{"flag": "EnableBetaUsages"}
2025-06-11T11:03:57Z	INFO	crossplane	Beta feature enabled	{"flag": "EnableBetaRealtimeCompositions"}
2025-06-11T11:03:57Z	INFO	crossplane	Beta feature enabled	{"flag": "EnableBetaDeploymentRuntimeConfigs"}
2025-06-11T11:03:57Z	INFO	crossplane	Beta feature enabled	{"flag": "EnableBetaClaimSSA"}
2025-06-11T11:04:03Z	DEBUG	crossplane	Event(v1.ObjectReference{Kind:"Lease", Namespace:"crossplane-system", Name:"crossplane-leader-election-core", UID:"967e6590-2d50-41ca-94f6-91949b591f57", APIVersion:"coordination.k8s.io/v1", ResourceVersion:"659566431", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' crossplane-596867cc4b-29wqz_cdade9e2-fc1c-40bf-ade1-c8df4ff9df5a became leader
Warning: ControllerConfig.pkg.crossplane.io/v1alpha1 is deprecated. Use DeploymentRuntimeConfig from pkg.crossplane.io/v1beta1 instead.
2025-06-11T11:04:04Z	DEBUG	crossplane	Reconciling	{"controller": "packages/provider.pkg.crossplane.io", "request": {"name":"provider-aws-ecr"}}
2025-06-11T11:04:04Z	DEBUG	crossplane	Reconciling	{"controller": "packages/provider.pkg.crossplane.io", "request": {"name":"provider-aws-iam"}}
2025-06-11T11:04:04Z	DEBUG	crossplane	Reconciling	{"controller": "packages/provider.pkg.crossplane.io", "request": {"name":"provider-aws-s3"}}
2025-06-11T11:04:04Z	DEBUG	crossplane	Reconciling	{"controller": "packages/provider.pkg.crossplane.io", "request": {"name":"upbound-provider-family-aws"}}
2025-06-11T11:04:04Z	DEBUG	crossplane	Reconciling	{"controller": "packages/provider.pkg.crossplane.io", "request": {"name":"aws"}}
2025-06-11T11:04:04Z	DEBUG	crossplane	Reconciling	{"controller": "packages/provider.pkg.crossplane.io", "request": {"name":"provider-aws-dynamodb"}}
2025-06-11T11:04:04Z	DEBUG	crossplane	Reconciling	{"controller": "packages/lock.pkg.crossplane.io", "request": {"name":"lock"}}
2025-06-11T11:04:04Z	DEBUG	crossplane	Reconciling	{"controller": "packages/configuration.pkg.crossplane.io", "request": {"name":"upbound-provider-family-aws"}}

dhumphries-sainsburys avatar Jun 11 '25 10:06 dhumphries-sainsburys

Can you show the output of kubectl get pods ... -o yaml for the crossplane pod?
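
What to look for in that output is whether IRSA was injected into the pod at all, which shows up as the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables on each container. A rough Python sketch of that check, run against the pod parsed from kubectl get pod ... -o json, could look like this. The function name is made up, and the sample pod in the example is a trimmed, hypothetical one.

```python
# Environment variables that indicate IRSA was injected into a container.
IRSA_ENV = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def containers_missing_irsa(pod: dict) -> dict:
    """Map container name -> IRSA env vars that are absent from it.

    pod is the parsed JSON of a single Pod object. An empty result
    means every init container and container carries both variables.
    """
    spec = pod.get("spec", {})
    containers = spec.get("initContainers", []) + spec.get("containers", [])
    missing = {}
    for container in containers:
        env_names = {entry["name"] for entry in container.get("env", [])}
        absent = [name for name in IRSA_ENV if name not in env_names]
        if absent:
            missing[container["name"]] = absent
    return missing


if __name__ == "__main__":
    # Trimmed, hypothetical pod for illustration.
    pod = {
        "spec": {
            "containers": [
                {
                    "name": "crossplane",
                    "env": [
                        {"name": "AWS_ROLE_ARN", "value": "arn:aws:iam::123456789012:role/crossplane"},
                        {"name": "AWS_WEB_IDENTITY_TOKEN_FILE", "value": "/var/run/secrets/eks.amazonaws.com/serviceaccount/token"},
                    ],
                }
            ]
        }
    }
    print(containers_missing_irsa(pod))
```

If both variables are present, the injection side is fine and the question moves on to whether the process ever uses them.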

haarchri avatar Jun 11 '25 11:06 haarchri

Sure:

apiVersion: v1
kind: Pod
metadata:
  name: crossplane-596867cc4b-j4hxg
  generateName: crossplane-596867cc4b-
  namespace: crossplane-system
  uid: 30839f44-4186-4cbe-8830-4f47d3fb49f3
  resourceVersion: '659688689'
  creationTimestamp: '2025-06-11T12:41:33Z'
  labels:
    app: crossplane
    app.kubernetes.io/component: cloud-infrastructure-controller
    app.kubernetes.io/instance: helm-crossplane
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: crossplane
    app.kubernetes.io/part-of: crossplane
    app.kubernetes.io/version: 1.20.0
    bosun.jspaas.uk/costcentre: PD7825
    helm.sh/chart: crossplane-1.20.0
    pod-template-hash: 596867cc4b
    release: helm-crossplane
  annotations:
    prometheus.io/path: /metrics
    prometheus.io/port: '8080'
    prometheus.io/scrape: 'true'
  ownerReferences:
    - apiVersion: apps/v1
      kind: ReplicaSet
      name: crossplane-596867cc4b
      uid: fbf20b2d-5b1b-405b-91fa-b144ec50f4a2
      controller: true
      blockOwnerDeletion: true
      subresource: status
  selfLink: /api/v1/namespaces/crossplane-system/pods/crossplane-596867cc4b-j4hxg
status:
  phase: Running
  conditions:
    - type: PodReadyToStartContainers
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2025-06-11T12:41:36Z'
    - type: Initialized
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2025-06-11T12:41:39Z'
    - type: Ready
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2025-06-11T12:41:42Z'
    - type: ContainersReady
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2025-06-11T12:41:42Z'
    - type: PodScheduled
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2025-06-11T12:41:33Z'
  hostIP: 10.14.1.25
  hostIPs:
    - ip: 10.14.1.25
  podIP: 100.64.24.240
  podIPs:
    - ip: 100.64.24.240
  startTime: '2025-06-11T12:41:33Z'
  initContainerStatuses:
    - name: crossplane-init
      state:
        terminated:
          exitCode: 0
          reason: Completed
          startedAt: '2025-06-11T12:41:36Z'
          finishedAt: '2025-06-11T12:41:39Z'
          containerID: >-
            containerd://e7ada0b6a75d1d84858a1b97ffde6239d2b2640cd4c51707429d6c322ef576d3
      lastState: {}
      ready: true
      restartCount: 0
      image: >-
        <account>.dkr.ecr.eu-west-1.amazonaws.com/bosun-image-cache/dockerhub/haarchri/crossplane:release-1.20
      imageID: >-
        <account>.dkr.ecr.eu-west-1.amazonaws.com/bosun-image-cache/dockerhub/haarchri/crossplane@sha256:050e28994f7189a41409050b72ed2d581a0fa00f33d29060869dd27eb5bd0ba3
      containerID: >-
        containerd://e7ada0b6a75d1d84858a1b97ffde6239d2b2640cd4c51707429d6c322ef576d3
      started: false
      volumeMounts:
        - name: kube-api-access-rx4kh
          mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          readOnly: true
          recursiveReadOnly: Disabled
        - name: aws-iam-token
          mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
          readOnly: true
          recursiveReadOnly: Disabled
  containerStatuses:
    - name: crossplane
      state:
        running:
          startedAt: '2025-06-11T12:41:40Z'
      lastState: {}
      ready: true
      restartCount: 0
      image: >-
        <account>.dkr.ecr.eu-west-1.amazonaws.com/bosun-image-cache/dockerhub/haarchri/crossplane:release-1.20
      imageID: >-
        <account>.dkr.ecr.eu-west-1.amazonaws.com/bosun-image-cache/dockerhub/haarchri/crossplane@sha256:050e28994f7189a41409050b72ed2d581a0fa00f33d29060869dd27eb5bd0ba3
      containerID: >-
        containerd://d508dc9905acbea435cecf63b3deee2aa2608f7ee3907929d1e03d1e0dbb3b28
      started: true
      volumeMounts:
        - name: package-cache
          mountPath: /cache/xpkg
        - name: function-cache
          mountPath: /cache/xfn
        - name: tls-server-certs
          mountPath: /tls/server
        - name: tls-client-certs
          mountPath: /tls/client
        - name: kube-api-access-rx4kh
          mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          readOnly: true
          recursiveReadOnly: Disabled
        - name: aws-iam-token
          mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
          readOnly: true
          recursiveReadOnly: Disabled
  qosClass: Burstable
spec:
  volumes:
    - name: aws-iam-token
      projected:
        sources:
          - serviceAccountToken:
              audience: sts.amazonaws.com
              expirationSeconds: 86400
              path: token
        defaultMode: 420
    - name: package-cache
      emptyDir:
        sizeLimit: 20Mi
    - name: function-cache
      emptyDir:
        sizeLimit: 512Mi
    - name: tls-server-certs
      secret:
        secretName: crossplane-tls-server
        defaultMode: 420
    - name: tls-client-certs
      secret:
        secretName: crossplane-tls-client
        defaultMode: 420
    - name: kube-api-access-rx4kh
      projected:
        sources:
          - serviceAccountToken:
              expirationSeconds: 3607
              path: token
          - configMap:
              name: kube-root-ca.crt
              items:
                - key: ca.crt
                  path: ca.crt
          - downwardAPI:
              items:
                - path: namespace
                  fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
        defaultMode: 420
  initContainers:
    - name: crossplane-init
      image: >-
        <account>.dkr.ecr.eu-west-1.amazonaws.com/bosun-image-cache/dockerhub/haarchri/crossplane:release-1.20
      args:
        - core
        - init
      env:
        - name: GOMAXPROCS
          valueFrom:
            resourceFieldRef:
              containerName: crossplane-init
              resource: limits.cpu
              divisor: '1'
        - name: GOMEMLIMIT
          valueFrom:
            resourceFieldRef:
              containerName: crossplane-init
              resource: limits.memory
              divisor: '1'
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: POD_SERVICE_ACCOUNT
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.serviceAccountName
        - name: WEBHOOK_SERVICE_NAME
          value: crossplane-webhooks
        - name: WEBHOOK_SERVICE_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: WEBHOOK_SERVICE_PORT
          value: '9443'
        - name: TLS_CA_SECRET_NAME
          value: crossplane-root-ca
        - name: TLS_SERVER_SECRET_NAME
          value: crossplane-tls-server
        - name: TLS_CLIENT_SECRET_NAME
          value: crossplane-tls-client
        - name: AWS_STS_REGIONAL_ENDPOINTS
          value: regional
        - name: AWS_DEFAULT_REGION
          value: eu-west-1
        - name: AWS_REGION
          value: eu-west-1
        - name: AWS_ROLE_ARN
          value: arn:aws:iam::<account>:role/crossplane
        - name: AWS_WEB_IDENTITY_TOKEN_FILE
          value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
      resources:
        limits:
          cpu: '1'
          memory: 512Mi
        requests:
          cpu: 100m
          memory: 256Mi
      volumeMounts:
        - name: kube-api-access-rx4kh
          readOnly: true
          mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        - name: aws-iam-token
          readOnly: true
          mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      imagePullPolicy: IfNotPresent
      securityContext:
        runAsUser: 65532
        runAsGroup: 65532
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
  containers:
    - name: crossplane
      image: >-
        <account>.dkr.ecr.eu-west-1.amazonaws.com/bosun-image-cache/dockerhub/haarchri/crossplane:release-1.20
      args:
        - core
        - start
        - '--debug'
      ports:
        - name: readyz
          containerPort: 8081
          protocol: TCP
        - name: metrics
          containerPort: 8080
          protocol: TCP
        - name: webhooks
          containerPort: 9443
          protocol: TCP
      env:
        - name: GOMAXPROCS
          valueFrom:
            resourceFieldRef:
              containerName: crossplane
              resource: limits.cpu
              divisor: '1'
        - name: GOMEMLIMIT
          valueFrom:
            resourceFieldRef:
              containerName: crossplane
              resource: limits.memory
              divisor: '1'
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: POD_SERVICE_ACCOUNT
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.serviceAccountName
        - name: LEADER_ELECTION
          value: 'true'
        - name: TLS_SERVER_SECRET_NAME
          value: crossplane-tls-server
        - name: TLS_SERVER_CERTS_DIR
          value: /tls/server
        - name: TLS_CLIENT_SECRET_NAME
          value: crossplane-tls-client
        - name: TLS_CLIENT_CERTS_DIR
          value: /tls/client
        - name: NEW_RELIC_METADATA_KUBERNETES_CLUSTER_NAME
          value: lab-ie-core
        - name: NEW_RELIC_METADATA_KUBERNETES_NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: NEW_RELIC_METADATA_KUBERNETES_NAMESPACE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: NEW_RELIC_METADATA_KUBERNETES_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: NEW_RELIC_METADATA_KUBERNETES_CONTAINER_NAME
          value: crossplane
        - name: NEW_RELIC_METADATA_KUBERNETES_CONTAINER_IMAGE_NAME
          value: >-
            <account>.dkr.ecr.eu-west-1.amazonaws.com/bosun-image-cache/dockerhub/haarchri/crossplane:release-1.20
        - name: NEW_RELIC_METADATA_KUBERNETES_DEPLOYMENT_NAME
          value: crossplane
        - name: AWS_STS_REGIONAL_ENDPOINTS
          value: regional
        - name: AWS_DEFAULT_REGION
          value: eu-west-1
        - name: AWS_REGION
          value: eu-west-1
        - name: AWS_ROLE_ARN
          value: arn:aws:iam::<account>:role/crossplane
        - name: AWS_WEB_IDENTITY_TOKEN_FILE
          value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
      resources:
        limits:
          cpu: '1'
          memory: 512Mi
        requests:
          cpu: 100m
          memory: 256Mi
      volumeMounts:
        - name: package-cache
          mountPath: /cache/xpkg
        - name: function-cache
          mountPath: /cache/xfn
        - name: tls-server-certs
          mountPath: /tls/server
        - name: tls-client-certs
          mountPath: /tls/client
        - name: kube-api-access-rx4kh
          readOnly: true
          mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        - name: aws-iam-token
          readOnly: true
          mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
      startupProbe:
        tcpSocket:
          port: readyz
        timeoutSeconds: 1
        periodSeconds: 2
        successThreshold: 1
        failureThreshold: 30
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      imagePullPolicy: IfNotPresent
      securityContext:
        runAsUser: 65532
        runAsGroup: 65532
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
  restartPolicy: Always
  terminationGracePeriodSeconds: 30
  dnsPolicy: ClusterFirst
  serviceAccountName: crossplane
  serviceAccount: crossplane
  nodeName: ip-10-14-1-25.eu-west-1.compute.internal
  securityContext: {}
  schedulerName: default-scheduler
  tolerations:
    - key: node.kubernetes.io/not-ready
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 300
    - key: node.kubernetes.io/unreachable
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 300
  priorityClassName: default
  priority: 0
  enableServiceLinks: true
  preemptionPolicy: PreemptLowerPriority

dhumphries-sainsburys avatar Jun 12 '25 12:06 dhumphries-sainsburys
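
The pod spec above already carries `AWS_ROLE_ARN`, `AWS_WEB_IDENTITY_TOKEN_FILE` and the `aws-iam-token` mount, which are the pieces IRSA relies on. As a rough sanity check, and only a sketch, the same conditions could be verified from inside a container roughly like this. The function name `irsa_problems` and the injectable `file_exists` parameter are made up for illustration; this is not part of Crossplane or any AWS SDK.

```python
import os

# Variables visible in the pod spec above that web identity auth depends on.
REQUIRED = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def irsa_problems(env, file_exists=os.path.isfile):
    """Return a list of human-readable problems; an empty list means none found.

    `env` is any mapping of environment variables; `file_exists` is injectable
    so the check can be exercised without a real token file (hypothetical helper).
    """
    problems = []
    for name in REQUIRED:
        if not env.get(name):
            problems.append(f"{name} is not set")
    token_path = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if token_path and not file_exists(token_path):
        problems.append(f"token file {token_path} is missing")
    return problems


if __name__ == "__main__":
    for problem in irsa_problems(os.environ):
        print(problem)
```

An empty result only says the variables and the token file are present. It does not prove that the role's trust policy or the ECR permissions are correct.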

Any chance you can, as a test, mirror the xpkg image to ECR with crane instead of using the pull-through cache? I wonder if we have an issue with the pull-through cache in ECR.

And which EKS version are you running? Do you have IMDSv2 set to required?

haarchri avatar Jun 12 '25 13:06 haarchri

I will give it a try in my environment with a pull-through cache ECR repo.

haarchri avatar Jun 12 '25 13:06 haarchri

Any chance you can, as a test, mirror the xpkg image to ECR with crane instead of using the pull-through cache? I wonder if we have an issue with the pull-through cache in ECR.

Already tried that separately, with the same result. I had the same thought, so I have also tried it with ECR completely open, again with the same result.

And which EKS version are you running? Do you have IMDSv2 set to required?

1.32 currently and yes

dhumphries-sainsburys avatar Jun 12 '25 14:06 dhumphries-sainsburys

@dhumphries-sainsburys any chance you can try this image? My current idea is that the problem is related to IMDSv2.

docker.io/haarchri/crossplane:release-1.20-imdsv2

haarchri avatar Jun 12 '25 15:06 haarchri

@haarchri - Got some new errors but probably not what you are looking for

2025-06-12T16:09:43Z	DEBUG	crossplane	Event(v1.ObjectReference{Kind:"ProviderRevision", Namespace:"", Name:"provider-aws-iam-78efa8cce6df", UID:"83010cd7-4555-4272-9a13-382f264b9f1e", APIVersion:"pkg.crossplane.io/v1", ResourceVersion:"661383948", FieldPath:""}): type: 'Warning' reason: 'LintPackage' incompatible Crossplane version: package is not compatible with Crossplane version (release-1.20-imdsv2): Invalid Semantic Version
2025-06-12T16:09:43Z	DEBUG	crossplane	Event(v1.ObjectReference{Kind:"ProviderRevision", Namespace:"", Name:"upbound-provider-family-aws-20202ddbd61a", UID:"985492c9-0920-4898-a442-74b9f6199860", APIVersion:"pkg.crossplane.io/v1", ResourceVersion:"661383951", FieldPath:""}): type: 'Warning' reason: 'LintPackage' incompatible Crossplane version: package is not compatible with Crossplane version (release-1.20-imdsv2): Invalid Semantic Version
2025-06-12T16:09:43Z	DEBUG	crossplane	Reconciling	{"controller": "packages/providerrevision.pkg.crossplane.io", "request": {"name":"upbound-provider-family-aws-20202ddbd61a"}}
2025-06-12T16:09:43Z	DEBUG	crossplane	Event(v1.ObjectReference{Kind:"ProviderRevision", Namespace:"", Name:"provider-aws-ecr-a4cd84c89c2f", UID:"ffd31b7e-7c15-403f-80b7-491d6a0a194c", APIVersion:"pkg.crossplane.io/v1", ResourceVersion:"661383955", FieldPath:""}): type: 'Warning' reason: 'LintPackage' incompatible Crossplane version: package is not compatible with Crossplane version (release-1.20-imdsv2): Invalid Semantic Version
2025-06-12T16:09:43Z	DEBUG	crossplane	Event(v1.ObjectReference{Kind:"ProviderRevision", Namespace:"", Name:"provider-aws-s3-bb85bd9605c3", UID:"72c5a003-ab00-4bf5-b0b8-ed601765f645", APIVersion:"pkg.crossplane.io/v1", ResourceVersion:"661383944", FieldPath:""}): type: 'Warning' reason: 'LintPackage' incompatible Crossplane version: package is not compatible with Crossplane version (release-1.20-imdsv2): Invalid Semantic Version
2025-06-12T16:09:43Z	DEBUG	crossplane	Event(v1.ObjectReference{Kind:"ProviderRevision", Namespace:"", Name:"upbound-provider-family-aws-20202ddbd61a", UID:"985492c9-0920-4898-a442-74b9f6199860", APIVersion:"pkg.crossplane.io/v1", ResourceVersion:"661383951", FieldPath:""}): type: 'Warning' reason: 'LintPackage' incompatible Crossplane version: package is not compatible with Crossplane version (release-1.20-imdsv2): Invalid Semantic Version
2025-06-12T16:09:44Z	DEBUG	crossplane	Event(v1.ObjectReference{Kind:"ProviderRevision", Namespace:"", Name:"provider-aws-dynamodb-74a4c867079f", UID:"6275c4e4-2478-4581-ac64-f2e492b5bb14", APIVersion:"pkg.crossplane.io/v1", ResourceVersion:"661383952", FieldPath:""}): type: 'Warning' reason: 'LintPackage' incompatible Crossplane version: package is not compatible with Crossplane version (release-1.20-imdsv2): Invalid Semantic Version

Otherwise same errors

dhumphries-sainsburys avatar Jun 12 '25 16:06 dhumphries-sainsburys
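
The LintPackage warnings above report `Invalid Semantic Version` for `release-1.20-imdsv2`, the version string reported by the debug build. As a minimal sketch of why such a string is rejected, the following check uses the regular expression suggested on semver.org, with an optional leading `v` added as an assumption. It is not the parser Crossplane itself uses, and `is_semver` is a hypothetical helper name.

```python
import re

# Pattern suggested by semver.org; the optional leading "v" is an assumption
# added here because Crossplane release versions are usually written as vX.Y.Z.
SEMVER = re.compile(
    r"^v?(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$"
)


def is_semver(version: str) -> bool:
    """Return True if `version` looks like MAJOR.MINOR.PATCH[-prerelease][+build]."""
    return SEMVER.match(version) is not None


if __name__ == "__main__":
    for candidate in ("release-1.20-imdsv2", "v1.20.0", "1.20.0-imdsv2"):
        print(candidate, is_semver(candidate))
```

Under this check `release-1.20-imdsv2` fails because it does not start with a numeric major version, while something like `1.20.0-imdsv2` would pass with `imdsv2` as a pre-release identifier.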

Just as another wild test, I have created a basic pod in the same namespace, using an image from this ECR:

apiVersion: v1
kind: Pod
metadata:
  name: test
  namespace: crossplane-system
spec:
  containers:
  - name: test
    image: <account>.dkr.ecr.eu-west-1.amazonaws.com/bosun-image-cache/ghcr/crossplane-contrib/provider-aws-dynamodb:v1.22.0

This works fine and pulls without any problem, as do other images following this approach. Note that this pod uses a different service account, but it is one with no permissions at all, and that works fine.

dhumphries-sainsburys avatar Jun 12 '25 16:06 dhumphries-sainsburys

Can you show me the manifest you apply? Can you skip the dependency resolution and the version constraint? This is needed because of the debug version I built for you; otherwise we don't get a pull.

https://doc.crds.dev/github.com/crossplane/crossplane/pkg.crossplane.io/Provider/[email protected]#spec-ignoreCrossplaneConstraints
https://doc.crds.dev/github.com/crossplane/crossplane/pkg.crossplane.io/Provider/[email protected]#spec-skipDependencyResolution

haarchri avatar Jun 12 '25 17:06 haarchri

The issue we have at the moment is that Crossplane first needs to pull the package to get the package.yaml.

haarchri avatar Jun 12 '25 17:06 haarchri

Not sure which manifest you are looking for, but here is an example Provider:

apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  creationTimestamp: '2025-05-13T11:19:38Z'
  generation: 14
  labels:
    k8slens-edit-resource-version: v1
    kustomize.toolkit.fluxcd.io/name: bosun-stack-crossplane-providers
    kustomize.toolkit.fluxcd.io/namespace: flux-system
  name: upbound-provider-family-aws
  resourceVersion: '662367386'
  uid: ee3f08d6-83f1-4882-970b-a073e47d6f78
  selfLink: /apis/pkg.crossplane.io/v1/providers/upbound-provider-family-aws
status:
  conditions:
    - lastTransitionTime: '2025-06-05T14:22:55Z'
      reason: HealthyPackageRevision
      status: 'True'
      type: Healthy
    - lastTransitionTime: '2025-06-11T08:27:58Z'
      message: >
        cannot unpack package: failed to fetch package digest from remote:
        failed to fetch package descriptor with a GET request after a previous
        HEAD request failure: HEAD
        https://<account>.dkr.ecr.eu-west-1.amazonaws.com/v2/bosun-image-cache/ghcr/crossplane-contrib/provider-family-aws/manifests/v1.21.0:
        unexpected status code 401 Unauthorized (HEAD responses have no body,
        use GET for details): GET
        https://<account>.dkr.ecr.eu-west-1.amazonaws.com/v2/bosun-image-cache/ghcr/crossplane-contrib/provider-family-aws/manifests/v1.21.0:
        unexpected status code 401 Unauthorized: Not Authorized
      reason: UnpackingPackage
      status: 'False'
      type: Installed
  currentIdentifier: xpkg.upbound.io/upbound/provider-family-aws:v1.21.0
  currentRevision: upbound-provider-family-aws-20202ddbd61a
  resolvedPackage: >-
    <account>.dkr.ecr.eu-west-1.amazonaws.com/bosun-image-cache/ghcr/crossplane-contrib/provider-family-aws:v1.21.0
spec:
  ignoreCrossplaneConstraints: true
  package: >-
    <account>.dkr.ecr.eu-west-1.amazonaws.com/bosun-image-cache/ghcr/crossplane-contrib/provider-family-aws:v1.21.0
  packagePullPolicy: IfNotPresent
  revisionActivationPolicy: Automatic
  revisionHistoryLimit: 2
  runtimeConfigRef:
    apiVersion: pkg.crossplane.io/v1beta1
    kind: DeploymentRuntimeConfig
    name: upbound-irsa
  skipDependencyResolution: true

This shows it with both settings enabled as requested, but there is no change in behaviour (they were previously false).

dhumphries-sainsburys avatar Jun 13 '25 08:06 dhumphries-sainsburys
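
The 401 in the status message is returned on the registry's manifest endpoint, whose URL follows directly from `spec.package`. As a small sketch of that mapping, the helper below (the name `manifest_url` is made up for illustration) turns an image reference into the `/v2/<repository>/manifests/<reference>` URL seen in the error. It assumes the reference always starts with a registry host and does no authentication; the account ID in the usage example is a placeholder.

```python
def manifest_url(ref: str) -> str:
    """Build the registry manifest URL for an image reference.

    Assumes `ref` has the form <registry>/<repository>[:tag|@digest];
    a missing tag falls back to "latest".
    """
    registry, _, rest = ref.partition("/")
    if not rest:
        raise ValueError(f"no repository in reference: {ref!r}")
    if "@" in rest:
        # Digest reference, e.g. repo@sha256:...
        repository, _, reference = rest.partition("@")
    else:
        repository, sep, reference = rest.rpartition(":")
        if not sep:
            # No tag given: rpartition put the whole string in `reference`.
            repository, reference = rest, "latest"
    return f"https://{registry}/v2/{repository}/manifests/{reference}"


if __name__ == "__main__":
    # 123456789012 is a placeholder account ID, not taken from the thread.
    print(manifest_url(
        "123456789012.dkr.ecr.eu-west-1.amazonaws.com/"
        "bosun-image-cache/ghcr/crossplane-contrib/provider-family-aws:v1.21.0"
    ))
```

Requesting that URL with the same credentials the crossplane pod would use is one way to reproduce the 401 outside of the package manager.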