Bottlerocket ignoring settings.container-registry.mirrors

Open maxtacu opened this issue 1 year ago • 9 comments

It seems that Bottlerocket is ignoring settings.container-registry.mirrors. We have configured a pull through cache in AWS ECR and set settings.container-registry.mirrors for quay.io, docker.io, ghcr.io and registry.k8s.io to use it, but the Bottlerocket instance is still pulling directly from upstream. It is not even trying to pull from the ECR pull through cache. I can clearly see that the mirrors are set when I run apiclient -u /settings on the instance. Our current settings look something like this:

  [[settings.container-registry.mirrors]]
  "registry" = "registry.k8s.io"
  "endpoint" = ["<account_id>.dkr.ecr.us-east-1.amazonaws.com/k8s"]
  [[settings.container-registry.mirrors]]
  "registry" = "quay.io"
  "endpoint" = ["<account_id>.dkr.ecr.us-east-1.amazonaws.com/quay"]
  [[settings.container-registry.mirrors]]
  "registry" = "docker.io"
  "endpoint" = ["<account_id>.dkr.ecr.us-east-1.amazonaws.com/dockerhub"]
  [[settings.container-registry.mirrors]]
  "registry" = "ghcr.io"
  "endpoint" = ["<account_id>.dkr.ecr.us-east-1.amazonaws.com/github"] 

Image I'm using: bottlerocket-aws-k8s-1.27-x86_64-v1.16.1

What I expected to happen: Pull through the ECR cache for images in the quay.io, docker.io, ghcr.io and registry.k8s.io registries.

What actually happened: Pulling directly from upstream. Not even trying to pull from the ECR cache

How to reproduce the problem:

  • Configure "Pull through cache" in ECR
  • Set settings.container-registry.mirrors in your bottlerocket instance using UserData.
  • Check that the setting is applied by connecting to the instance via either ssh or Session Manager and running apiclient -u /settings
  • Let your node in EKS pull any image from quay.io, docker.io, ghcr.io or registry.k8s.io registries.
  • See containerd logs or k8s events for the deployed application

maxtacu avatar Nov 30 '23 13:11 maxtacu

Thanks for reaching out. We are looking into this issue.

ecpullen avatar Nov 30 '23 19:11 ecpullen

Hi, the configuration specified in settings.container-registry is passed as is to the docker/containerd/kubelet configuration for registry mirrors and credentials. You will likely need to specify credentials to be able to talk to the private ECR repositories. To get the credentials, you can follow the instructions here: https://docs.aws.amazon.com/AmazonECR/latest/userguide/registry_auth.html and specify them via settings.container-registry.credentials.

Do note that there is a potential security concern for docker-based variants like aws-ecs-* when specifying auth for your mirror, outlined here: https://github.com/moby/moby/issues/30880#issuecomment-798807332. Your credentials might end up being sent to the destination registry if your mirror is not responding, because the image resolver tries all possible endpoints.
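As a rough illustration of what such a credentials entry could look like, here is a minimal Python sketch that renders a settings.container-registry.credentials fragment from an ECR password (for example the output of aws ecr get-login-password). The function name and the placeholder account ID are hypothetical. It assumes the auth key takes the base64 encoding of "username:password" and that the ECR username is AWS. ECR tokens expire after 12 hours, so a static entry like this would need to be refreshed.

```python
import base64


def ecr_credentials_toml(registry_host: str, password: str, username: str = "AWS") -> str:
    """Render a TOML fragment for settings.container-registry.credentials.

    Assumption: the `auth` value is base64("username:password").
    The helper itself is hypothetical and not part of Bottlerocket.
    """
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    return (
        "[[settings.container-registry.credentials]]\n"
        f'registry = "{registry_host}"\n'
        f'auth = "{auth}"\n'
    )


if __name__ == "__main__":
    # Placeholder account ID and password, for illustration only.
    print(ecr_credentials_toml("123456789012.dkr.ecr.us-east-1.amazonaws.com", "example-token"))
```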

etungsten avatar Dec 01 '23 19:12 etungsten

@etungsten isn't it enough to have an IAM role for the instance to access private ECR? If I tell the instance to pull an image from ECR directly, it pulls it, but 'mirroring' requires credentials? Why so?

maxtacu avatar Dec 03 '23 00:12 maxtacu

Hi @maxtacu, in the case of using a private ECR image directly with K8s pods, kubelet can get the ECR credentials from the AWS cloud provider (specifically the ECR credentials helper). However, that code path does not get triggered if the destination image URL is not ECR, and you set ECR as a private mirror instead. kubelet would only see that you're trying to pull from quay.io/docker.io/ghcr.io. In this case, after kubelet tells cri-containerd to pull the image, cri-containerd will need auth information for talking to the registry endpoints that you set as mirror.

For more details, you can check out this containerd issue: https://github.com/containerd/containerd/issues/6637

I think for your use case it might be easier to set up private ECR as a pull through cache, as detailed here: https://docs.aws.amazon.com/AmazonECR/latest/userguide/pull-through-cache.html. You then wouldn't need to set private ECR as a registry mirror, but instead would need to set private ECR as the actual image URLs for your pods. kubelet would then be able to get the ECR credentials through the AWS cloud provider. Though I understand it might be troublesome to modify your K8s deployments to replace all of the image URIs you're trying to mirror/cache.
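To make the image URL replacement concrete, here is a minimal Python sketch of the rewrite, not an official tool. The prefixes (k8s, quay, dockerhub, github) are taken from the mirror settings in the original report. The function name and the placeholder account ID are hypothetical. It also assumes that Docker Hub official images need the library/ prefix when pulled through the ECR cache.

```python
# Upstream registry -> ECR repository prefix, as used in the reported settings.
PREFIXES = {
    "registry.k8s.io": "k8s",
    "quay.io": "quay",
    "docker.io": "dockerhub",
    "ghcr.io": "github",
}


def to_pull_through_uri(image: str, ecr_host: str) -> str:
    """Rewrite an upstream image reference to its pull through cache URI."""
    first, sep, rest = image.partition("/")
    # A first component containing "." or ":" (or "localhost") is a registry host.
    if sep and ("." in first or ":" in first or first == "localhost"):
        host, path = first, rest
    else:
        # No registry given: Docker Hub is the implicit default.
        host, path = "docker.io", image
    if host == "docker.io" and "/" not in path:
        # Assumption: official images live under library/ on Docker Hub.
        path = "library/" + path
    prefix = PREFIXES.get(host)
    if prefix is None:
        return image  # Registry is not cached; leave the reference unchanged.
    return f"{ecr_host}/{prefix}/{path}"


if __name__ == "__main__":
    ecr = "123456789012.dkr.ecr.us-east-1.amazonaws.com"  # placeholder account ID
    print(to_pull_through_uri("quay.io/prometheus/node-exporter:v1.7.0", ecr))
```

A mutating webhook or a Kyverno policy would apply the same kind of mapping to pod specs at admission time, so the deployments themselves would not have to change.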

etungsten avatar Dec 03 '23 03:12 etungsten

Weird, but it looks like today it randomly started trying to pull through the ECR mirror, and now we have the same issue as described here: https://github.com/bottlerocket-os/bottlerocket/issues/2427. I will test with credentials later. But it seems to be an issue with Bottlerocket (or containerd itself) not being able to properly use IAM roles, only credentials.

maxtacu avatar Dec 04 '23 12:12 maxtacu

has anyone found a workaround for this yet?

tuananh avatar Apr 04 '24 10:04 tuananh

Kyverno or another mutation webhook

svyatoslavmo avatar Apr 04 '24 10:04 svyatoslavmo

@tuananh within Bottlerocket I didn't find any workaround, so eventually I created a mutating webhook. The repository also includes Kyverno policies in case you want to do it only via Kyverno.

maxtacu avatar Apr 04 '24 16:04 maxtacu

edit: looks like this is how it works

api => kubelet => check the registry and see if we need to generate creds via CredentialProviderConfig (https://kubernetes.io/docs/tasks/administer-cluster/kubelet-credential-provider/):

  • if yes => pass them down to containerd. In this case, kubelet sees the registry as quay.io => no credentials => quay.io is passed over to containerd. containerd sees the mirror settings => changes it to the ECR pull through cache, but now there are no credentials
  • if no ...
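The flow above can be sketched as a toy Python model. This is not actual kubelet or containerd code: the function names, the glob pattern for ECR hosts, and the placeholder account ID are all assumptions for illustration. It only shows that credentials are looked up against the registry kubelet sees, before containerd swaps in the mirror endpoint.

```python
from fnmatch import fnmatchcase


def kubelet_credentials(image_host, providers):
    """Return creds only if a provider pattern matches the host kubelet sees."""
    for pattern, creds in providers.items():
        if fnmatchcase(image_host, pattern):
            return creds
    return None


def containerd_endpoint(image_host, mirrors):
    """Return the first configured mirror endpoint, else the host itself."""
    return mirrors.get(image_host, [image_host])[0]


def simulate_pull(image_host, providers, mirrors):
    # Step 1: kubelet resolves credentials from the original registry host.
    creds = kubelet_credentials(image_host, providers)
    # Step 2: containerd rewrites the host to the mirror, reusing those creds.
    return containerd_endpoint(image_host, mirrors), creds


if __name__ == "__main__":
    ecr = "123456789012.dkr.ecr.us-east-1.amazonaws.com"  # placeholder account ID
    providers = {"*.dkr.ecr.*.amazonaws.com": "ecr-token"}
    mirrors = {"quay.io": [ecr + "/quay"]}
    # Pulling via the mirror: endpoint is ECR, but no credentials were produced.
    print(simulate_pull("quay.io", providers, mirrors))
    # Pulling from ECR directly: the provider pattern matches, creds are present.
    print(simulate_pull(ecr, providers, mirrors))
```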

tuananh avatar Apr 05 '24 02:04 tuananh