Provide a configuration to define a registry mirror on AKS worker nodes

Open nmeisenzahl opened this issue 4 years ago • 42 comments

With the new Docker Hub rate-limit in place, it is a best practice to run/use a registry mirror.

I think this is something that needs to be defined on containerd and is therefore abstracted. It would be good to have a configuration parameter to optionally provide a registry mirror URL that then gets added to all nodes.

nmeisenzahl avatar Nov 03 '20 14:11 nmeisenzahl

Hi nmeisenzahl, AKS bot here :wave: Thank you for posting on the AKS Repo, I'll do my best to get a kind human from the AKS team to assist you.

I might be just a bot, but I'm told my suggestions are normally quite good, as such:

  1. If this case is urgent, please open a Support Request so that our 24/7 support team may help you faster.
  2. Please abide by the AKS repo Guidelines and Code of Conduct.
  3. If you're having an issue, could it be described on the AKS Troubleshooting guides or AKS Diagnostics?
  4. Make sure you're subscribed to the AKS Release Notes to keep up to date with all that's new on AKS.
  5. Make sure there isn't a duplicate of this issue already reported. If there is, feel free to close this one and '+1' the existing issue.
  6. If you have a question, do take a look at our AKS FAQ. We place the most common ones there!

ghost avatar Nov 03 '20 14:11 ghost

Maybe something like this can be used until it's supported?

https://github.com/patnaikshekhar/AKSNodeInstaller

simongottschlag avatar Nov 03 '20 22:11 simongottschlag

Triage required from @Azure/aks-pm

ghost avatar Nov 06 '20 00:11 ghost

Action required from @Azure/aks-pm

ghost avatar Nov 11 '20 01:11 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Nov 26 '20 06:11 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Dec 11 '20 12:12 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Dec 26 '20 18:12 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Jan 11 '21 00:01 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Jan 26 '21 06:01 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Feb 10 '21 06:02 ghost

@nmeisenzahl: You want to provide a 3rd party mirror during cluster create?

TomGeske avatar Feb 10 '21 13:02 TomGeske

@TomGeske Yes. The goal is to minimize external dependencies (Docker Hub rate limiting, downtime of external registries) and to get faster pull times. Retagging external images is sometimes an issue because upgrade strategies and processes are missing.

That said, a managed mirror (provided by Azure, maybe ACR) would be the best solution.

nmeisenzahl avatar Feb 10 '21 13:02 nmeisenzahl

@SteveLasker would you be able to assist?

Issue Details

With the new Docker Hub rate-limit in place, it is a best practice to run/use a registry mirror.

I think this is something that needs to be defined on containerd and is therefore abstracted. It would be good to have a configuration parameter to optionally provide a registry mirror URL that then gets added to all nodes.

Author: nmeisenzahl
Assignees: -
Labels:

azure/acr, feature-request

Milestone: -

ghost avatar Feb 10 '21 13:02 ghost

Maybe @SteveLasker has some thoughts on this.

TomGeske avatar Feb 10 '21 13:02 TomGeske

Is there any progress/planning with this issue?

This seems very similar to aks-engine #3961, which also requests exposing containerd's mirror configuration. I would also request the ability to add private CA certificates for those mirrors. With Docker on earlier AKS versions, some people used DaemonSets to add CA certificates to the Docker daemon. I guess this won't work anymore.

mayrstefan avatar Jun 20 '21 11:06 mayrstefan

Containerd backend support on Linux for this was implemented in 1.5, which shipped with AKS 1.22. I have implemented the configuration options to allow for configuring registry settings and CA certs in the Azure/AgentBaker#1369 PR, which should merge soon. Because of the holiday, it won't hit AKS until the first release next year, which should hit production probably around the end of January 2022. Once the release in January is created, this option will be applied when nodes are re-created on a nodepool upgrade or node-image upgrade as long as the cluster is running at least version 1.22. Check out Containerd registry configuration for information on what you can put in hosts.toml and how to use it.

You copy your registry configuration files into place using something like the following DaemonSet and ConfigMap. If you want to configure multiple registries, add more ConfigMaps (one for each registry) and mount them to /src/certs.d/registryhostname. Note that if you use a port in your image pull (like myregistry.mydomain.com:5000) you must put that port in the directory name. Also note that if you change the configmaps you'll need to restart the DaemonSet pods with kubectl rollout restart -n kube-system daemonset/containerd-registry-config-daemon.

apiVersion: v1
data:
  docker-mirror.crt: |
    -----BEGIN CERTIFICATE-----
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    -----END CERTIFICATE-----
  hosts.toml: |
    [host."https://docker-mirror.internal"]
      capabilities = ["pull", "resolve"]
      ca = "docker-mirror.crt"
kind: ConfigMap
metadata:
  name: containerd-dockerio-mirror
  namespace: kube-system
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: containerd-registry-config-daemon
  namespace: kube-system
  labels:
    app: containerd-registry-config
spec:
  selector:
    matchLabels:
      app: containerd-registry-config
  template:
    metadata:
      labels:
        app: containerd-registry-config
    spec:
      nodeSelector:
        kubernetes.io/os: linux
      tolerations:
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
      containers:
        - name: copy-files
          image: busybox
          command:
            - /bin/sh
            - -c
            - cd /src;
              find */* -type d ! -name '..*' -exec mkdir -pv '/etc/containerd/{}' \;;
              find */* -type l ! -name '..*' -exec cp -v '{}' '/etc/containerd/{}' \;;
              echo "Done copying files.";
              sleep infinity
          volumeMounts:
            - name: hostcontainerd
              mountPath: /etc/containerd
            - name: dockerio
              mountPath: /src/certs.d/docker.io
      terminationGracePeriodSeconds: 0
      volumes:
        - name: hostcontainerd
          hostPath:
            path: /etc/containerd
            type: Directory
        - name: dockerio
          configMap:
            name: containerd-dockerio-mirror
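
As a rough sketch of the multiple-registry case described above, a second ConfigMap could look like the following. The registry hostname `myregistry.mydomain.com` (shown without a port), the ConfigMap name `containerd-myregistry`, the volume name `myregistry`, and the certificate file name are all hypothetical and not part of the original comment. The trailing comments show the extra entries the DaemonSet would need.

```yaml
# Hypothetical second registry; every name here is illustrative.
apiVersion: v1
kind: ConfigMap
metadata:
  name: containerd-myregistry
  namespace: kube-system
data:
  myregistry.crt: |
    -----BEGIN CERTIFICATE-----
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    -----END CERTIFICATE-----
  hosts.toml: |
    [host."https://myregistry.mydomain.com"]
      capabilities = ["pull", "resolve"]
      ca = "myregistry.crt"
# Matching additions to the DaemonSet (shown as comments, not a full manifest):
#   containers[0].volumeMounts:
#     - name: myregistry
#       mountPath: /src/certs.d/myregistry.mydomain.com
#   volumes:
#     - name: myregistry
#       configMap:
#         name: containerd-myregistry
```

Each registry gets its own directory under `certs.d`, so the copy script in the DaemonSet picks up the new files without changes.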

phealy avatar Dec 16 '21 14:12 phealy

reopening this issue until the change actually rolls to prod - it auto linked and I didn't realize it.

phealy avatar Jan 03 '22 17:01 phealy

We're thinking about ways to make this more configurable, but for now the containerd change is available for new-build clusters on at least version 1.22 in eastus, with further regions to follow in the next week or so.

phealy avatar Jan 16 '22 16:01 phealy

@phealy Does AgentBaker support windows hosts? Will the above, with the right path/image changes, work for windows hosts as well?

giskou avatar Jan 17 '22 15:01 giskou

Not quite yet - but we are working on that!

phealy avatar Jan 17 '22 15:01 phealy

This works great. The only thing I suggest is adding a toleration for everything, so the DaemonSet also runs on worker nodes in node pools that have taints added.

    # tolerate everything so all nodepool taints are tolerated.
    - key: ""
      operator: Exists
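
For context, here is a minimal sketch of where that toleration would sit in the DaemonSet. It is a fragment to merge into the existing manifest, not a complete one. It assumes the catch-all entry replaces the original master-only toleration, and it omits `key` entirely, which Kubernetes treats the same as an empty key with `operator: Exists`.

```yaml
# Fragment of the DaemonSet spec (merge into the existing manifest).
spec:
  template:
    spec:
      tolerations:
        # No key and no effect: matches every taint, so the pod
        # can be scheduled onto any tainted node pool.
        - operator: Exists
```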

srikiz avatar Feb 02 '22 06:02 srikiz

Are there any plans to support registries with authentication? It seems the config for containerd in the supported directory (/etc/containerd/certs.d) does not support setting the CRI registry credentials.

This seems to require a daemon reload.

braunsonm avatar Feb 03 '22 19:02 braunsonm

Action required from @Azure/aks-pm

ghost avatar Aug 08 '22 01:08 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Aug 23 '22 06:08 ghost

Issue needing attention of @Azure/aks-leads

ghost avatar Sep 07 '22 12:09 ghost

> does not support setting the CRI registry credentials.

have you seen https://github.com/containerd/containerd/blob/main/docs/hosts.md#client-field? or similarly https://github.com/containerd/containerd/blob/main/docs/hosts.md#support-for-dockers-certificate-file-pattern ?

> This seems to require a daemon reload.

that is unfortunately necessary right now, would be interesting to see if we can get containerd upstream to automatically reload those...

alexeldeib avatar Sep 09 '22 02:09 alexeldeib

@alexeldeib That's for client certificates, not auth tokens, identity tokens, or username/passwords. You can get away with a base64-encoded username/password with some registries, but not all. Registries that implement OIDC authentication and issue tokens limited to the scope of the image being pulled (Docker Hub does this in most cases) need the client to do this negotiation/token exchange. Containerd can do this from what I understand, but only under the old config at the time of writing my original comment.

braunsonm avatar Sep 09 '22 03:09 braunsonm

hmm interesting point. does the mirror config + imagePullSecrets at k8s layer work for your use case?

at a glance, it seems like the CRI config still works but is deprecated and the only supported way to pass creds for image pull/push to containerd is via CRI API? @cpuguy83 is that accurate?

I'd rather not add support for deprecated config, but not sure I totally follow what the alternative is // what your use case is if pull secrets or similar doesn't work
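
To make the `imagePullSecrets` option concrete, here is a minimal sketch of a Pod that pulls with a registry credential. The Secret name `dockerhub-creds`, the Pod name, and the image are hypothetical. The Secret is assumed to be of type `kubernetes.io/dockerconfigjson` and to exist in the same namespace.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app                            # hypothetical
spec:
  containers:
    - name: app
      image: docker.io/library/nginx:latest    # illustrative image
  imagePullSecrets:
    # Credentials are handed to the runtime per pod, so every workload
    # (or its ServiceAccount) has to reference the Secret.
    - name: dockerhub-creds                    # hypothetical Secret
```

Because the reference lives on each Pod or ServiceAccount, this is a per-workload setting, not a node-wide one.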

alexeldeib avatar Sep 09 '22 03:09 alexeldeib

Yea exactly @alexeldeib

That doesn't work in my use case as I'm trying to automatically have my AKS cluster authenticate with docker hub for all pulls to avoid rate limiting.

braunsonm avatar Sep 09 '22 11:09 braunsonm

The deprecated config is the only other way to do this right now. This should remain in place at least until 1.7 and would be removed after that in 2.0.

The only other approach at the moment is to use a registry cache which has the creds already.
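
One way to read "a registry cache which has the creds already" is a pull-through cache that authenticates upstream on behalf of the nodes. As an illustration only, and not necessarily the setup the commenter had in mind, this is roughly what the configuration of the open-source Distribution registry looks like in that mode. The listen address, storage path, and credential placeholders are assumptions.

```yaml
# config.yml for a Distribution ("registry") instance run as a pull-through cache.
version: 0.1
storage:
  filesystem:
    rootdirectory: /var/lib/registry
http:
  addr: :5000
proxy:
  remoteurl: https://registry-1.docker.io
  # Upstream credentials; placeholders, supply your own.
  username: DOCKER_HUB_USERNAME
  password: DOCKER_HUB_ACCESS_TOKEN
```

Nodes would then reach the cache through a `hosts.toml` mirror entry like the one shown earlier in the thread, so the credentials stay on the cache and never land on the nodes.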

cpuguy83 avatar Sep 09 '22 15:09 cpuguy83