Unify the approach to manage etcd images in Kubernetes

Open ahrtr opened this issue 8 months ago • 34 comments

What would you like to be added?

The etcd container images are currently built from cluster/images/etcd, pushed to registry.k8s.io/etcd, and used by kubeadm, KOps and Kubernetes workflow tests.

Concerns and comments:

  • It would be good to consider moving the image build definition out of the Kubernetes repo
  • Why not use etcd's officially released images?

Proposals:

  • Push etcd officially released images to registry.k8s.io/etcd
  • Deprecate the migration tool

See also https://docs.google.com/document/d/1B0C391PJ2zwHnmIwOWpzq-tjzj9-Gci-Ph2k9mwC5R4/edit?tab=t.0#heading=h.xblvj5c0ffhf

Please feel free to comment on the google doc, thx

cc @ivanvc @jmhbnz @serathius @siyuanfoundation @wenjiaswe @neolit123 @liggitt @BenTheElder @hakman

Why is this needed?

To ensure a smooth switch to etcd's officially released images.

ahrtr avatar Apr 25 '25 15:04 ahrtr

I'd love to take part since I was involved with building and updating etcd images for K8s.

joshjms avatar May 29 '25 18:05 joshjms

thx @joshjms. I believe you need to work together with @ivanvc & @jmhbnz (since we also need to change our release script to push the image to the k/k registry) and @siyuanfoundation (who should have more knowledge of the etcd image build stuff in the k/k repo, also with @serathius's guidance/help).

As mentioned in the community meeting, at a high level I think there are two big steps:

  • Step 1: persuade & help all users (see below) to migrate to the officially released etcd image. We also need to update our release script to push the etcd image to the k/k registry
    • Kubeadm cc @neolit123
    • KOps. cc @hakman
    • K/k workflow cc ?
  • Step 2: deprecate the etcd build stuff in k/k repo and eventually remove it 1-2 years later

ahrtr avatar May 29 '25 18:05 ahrtr

Looking forward to it

joshjms avatar May 29 '25 18:05 joshjms

I think we have two choices:

  • We just push the officially released etcd images to registry.k8s.io/etcd
  • We ask all users to use the officially released etcd images directly. For KOps, for example, I don't see any blocker to doing this (@hakman). For other users, there may be some pushback.

@joshjms I suggest raising an issue for each user below, asking them to switch to the officially released etcd image, so we can get their feedback.

  • https://github.com/kubernetes/kops
  • https://github.com/kubernetes/kubeadm
  • For Kubernetes workflow, probably we can just raise an issue in https://github.com/kubernetes/kubernetes?

ahrtr avatar Jul 10 '25 19:07 ahrtr

Understood, will do

joshjms avatar Jul 10 '25 19:07 joshjms

@ahrtr @joshjms I started a test just to see if switching to the official images works for kOps (https://github.com/kubernetes/kops/pull/17485). I would still prefer images to be pushed to registry.k8s.io/etcd due to the benefits of registry.k8s.io.

hakman avatar Jul 11 '25 05:07 hakman

I would still prefer images to be pushed to registry.k8s.io/etcd due to the benefits of registry.k8s.io.

I agree, I think we will move towards this as discussed in the docs. We would need to update our release script for future releases (cc @ivanvc @jmhbnz) and we can push our past releases to help ease integration.

joshjms avatar Jul 11 '25 05:07 joshjms

The test from https://github.com/kubernetes/kops/pull/17485 worked fine. Switching to gcr.io/etcd-development/etcd:v3.5.21 had no impact on e2e tests. So, as long as the official images would be published to registry.k8s.io, the change should be trivial.

hakman avatar Jul 11 '25 07:07 hakman

I'll summarize the progress here.

According to the docs in k8s.io, access to the staging repo for etcd is given to the group defined under k8s-infra-staging-etcd. You can see the group definition here. Access to the staging repository is given to me and etcd leads.

Then, I pushed our images as-is to the staging repository, that is, without changing anything, including the tag. The image digests are therefore the same as the ones in etcd's releases, which commits us to using actual etcd releases.

After these images are in staging, to be able to use them for KOps, kubeadm, etc, they have to be promoted. The procedure for promoting is adding them to the yaml file. This is done in this PR.
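A rough sketch of what one promotion entry could look like. It assumes the promoter YAML file maps each image digest to a list of tags under a dmap key; that layout, the helper name promo_entry, and the placeholder digest are assumptions for illustration, so check the actual file in the k8s.io repo before relying on it.

```shell
# Hypothetical helper: print one promotion entry for an image.
# Assumed layout: "- name: <image>", then a dmap of digest -> [tags].
promo_entry() {
    local name="$1" digest="$2" tag="$3"
    printf '%s\n' "- name: ${name}" "  dmap:" "    \"${digest}\": [\"${tag}\"]"
}

# "sha256:<digest>" is a placeholder, not a real digest. In practice the
# value could come from something like:
#   crane digest gcr.io/k8s-staging-etcd/etcd:v3.5.21
promo_entry etcd "sha256:<digest>" v3.5.21
```

Because the images are pushed unchanged, the digest in the entry should be the same one published with the etcd release.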

I think the next step is to create a PR in k/k to test one image (ideally the latest one) to see if it can pass the CI tests seamlessly.

@ivanvc @ahrtr Thoughts?

joshjms avatar Aug 10 '25 14:08 joshjms

@joshjms, why did we decide to drop the -0 suffix from the image tag? I don't know when or where we had that conversation.

ivanvc avatar Aug 15 '25 17:08 ivanvc

We discussed this in our community meeting on August 21st, 2025. We agreed to push our etcd:v<version> image as registry.k8s.io/etcd:<version>-r0, as a minimal step to achieve compatibility and to push the K8s etcd release image from our release script.

ivanvc avatar Aug 22 '25 05:08 ivanvc

i left a similar comment on the related pr that updates etcd to latest in k/k. from what i've seen, users of custom local registries that mirror registry.k8s.io have baked scripts with assumptions about the paths and version format of any image including etcd. last time, when coredns tweaked their subpath, it broke a number of users.

while kubeadm will work fine with this change similarly to kOps, these custom registry users might have complaints, and that is why i suggested to avoid the change unless necessary.

if the change is considered necessary, i guess we will have to direct complaints to sig-etcd.

@SataQiu @pacoxu @carlory @HirazawaUi (cc kubeadm OWNERS)

neolit123 avatar Aug 22 '25 07:08 neolit123

I suppose we will only push the multiarch images into the registry as etcd:{version}-0. I'm not sure how to retag etcd:v{version}-{arch} into the format used in k/k.

joshjms avatar Sep 14 '25 14:09 joshjms

@joshjms I don’t think the ‘arch’ ones are important. They should not be pushed at all going forward. The ‘-0’ suffix is important going forward, for newly released images. Older images are already in the registry. Is this correct @neolit123 ?

hakman avatar Sep 14 '25 14:09 hakman

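#!/usr/bin/bash
# Copy each upstream multi-arch etcd release image to the k8s staging
# registry, renaming tag v<version> to the k/k format <version>-0.

# List every tag published in the upstream etcd repository.
mapfile -t image_tags < <(crane ls gcr.io/etcd-development/etcd)

for tag in "${image_tags[@]}"; do
    # Skip tags without a leading "v" and per-architecture tags.
    if [[ ! "${tag}" =~ ^v || "${tag}" =~ -(amd64|arm64|ppc64le|s390x)$ ]]; then
        echo "skipping ${tag}"
        continue
    fi
    echo "pushing image etcd:${tag}"

    # Strip the leading "v" and append the "-0" suffix used in k/k.
    new_tag="${tag#v}-0"
    echo "new tag: ${new_tag}"
    crane copy gcr.io/etcd-development/etcd:"${tag}" gcr.io/k8s-staging-etcd/etcd:"${new_tag}"
done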

So using this script, we can retag the images using the k/k format and push to k8s-staging-etcd. @ivanvc @hakman @neolit123
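
The tag mapping can also be checked offline before anything is copied. A minimal sketch, where k8s_tag is a hypothetical helper (not part of the script above) that applies the same skip and rename rules to a single tag, and the sample tags in the loop are made up for illustration:

```shell
# Hypothetical helper: map an upstream etcd tag to the k/k-style tag.
# Prints nothing and returns 1 for tags the retag script would skip.
k8s_tag() {
    case "$1" in
        *-amd64|*-arm64|*-ppc64le|*-s390x) return 1 ;;  # per-arch tags
        v*) echo "${1#v}-0" ;;                          # v3.6.6 -> 3.6.6-0
        *) return 1 ;;                                  # no leading "v"
    esac
}

# Sample tags for illustration only; no registry is contacted.
for tag in v3.5.21 v3.5.21-arm64 latest v3.6.6; do
    if new_tag="$(k8s_tag "${tag}")"; then
        echo "${tag} -> ${new_tag}"
    else
        echo "${tag} -> skipped"
    fi
done
```

Feeding the output of crane ls through this helper gives a dry run of what the script would push.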

joshjms avatar Sep 14 '25 15:09 joshjms

Only missing 3.5 and 3.6?

hakman avatar Sep 14 '25 15:09 hakman

i don't know if existing registry.k8s.io/etcd images are multiarch/os. if yes, then that has to continue being the case. if not, making them multiarch/os should be backwards compatible.
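
One way to check this is to look at the media type of the image manifest. A minimal sketch, assuming the manifest JSON is piped in (for example from crane manifest) and that a multi-arch image reports a Docker manifest list or OCI image index media type; is_multiarch is a hypothetical helper name, and an index that omits mediaType would not be detected:

```shell
# Hypothetical helper: read a manifest (JSON) on stdin and succeed only
# if its mediaType is a Docker manifest list or an OCI image index.
is_multiarch() {
    grep -Eq '"mediaType"[[:space:]]*:[[:space:]]*"application/vnd\.(docker\.distribution\.manifest\.list\.v2|oci\.image\.index\.v1)\+json"'
}

# Example against a real registry (needs network access and crane):
#   crane manifest gcr.io/etcd-development/etcd:v3.5.21 | is_multiarch \
#       && echo "multi-arch" || echo "single-arch"
```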

neolit123 avatar Sep 14 '25 16:09 neolit123

I suppose we will only push the multiarch images into the registry as etcd:{version}-0. I'm not sure how to retag etcd:v{version}-{arch} into the format used in k/k.

you could ask in sig-k8s-infra on k8s slack, but afaik you have to promote again and there is no retag.

neolit123 avatar Sep 14 '25 17:09 neolit123

I pushed the {version}-0 images into staging. Will send a PR to promote them soon.

joshjms avatar Sep 19 '25 03:09 joshjms

Now that the images used in k/k are etcd's official images (soon, via https://github.com/kubernetes/kubernetes/pull/134251 🥳, thanks everyone), we can remove the image building code in k/k after some (maybe 2 patch) versions with no issues.

joshjms avatar Oct 03 '25 10:10 joshjms

After that PR is merged, should I update the docs on updating etcd version in k/k? @ahrtr

joshjms avatar Oct 03 '25 10:10 joshjms

After that PR is merged, should I update the docs on updating etcd version in k/k? @ahrtr

Yes, please.

we can remove the image building code in k/k after some (maybe 2 patch) versions with no issues.

Let's hold off on removing the etcd image build stuff for now, just in case. I propose waiting at least one release cycle (e.g. until after K8s 1.35 is released) for safety.

ahrtr avatar Oct 03 '25 10:10 ahrtr

Let's hold off on removing the etcd image build stuff for now, just in case. I propose waiting at least one release cycle (e.g. until after K8s 1.35 is released) for safety.

Let's add a note that it's deprecated and planned to be removed after ...

hakman avatar Oct 03 '25 10:10 hakman

Let's add a note that it's deprecated and planned to be removed after ...

Makes sense.

ahrtr avatar Oct 03 '25 10:10 ahrtr

Note: consider only using etcd officially released multi-arch image tags, e.g. v3.6.6, and stop using tags like 3.6.6-0. cc @joshjms @hakman @neolit123 @ivanvc

links:

  • https://github.com/kubernetes/k8s.io/pull/8769#issuecomment-3569722807
  • https://github.com/kubernetes/k8s.io/pull/8769#issuecomment-3570032156

ahrtr avatar Nov 24 '25 10:11 ahrtr

like i said before, i don't mind that change. it might break the bash scripts and regexes of some users who anticipate a -x suffix in the tag and parse image tags manually for their downstream purposes, like custom registries, but that's likely not an issue for the majority of users.
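
For downstream scripts that parse the tag, accepting both formats would avoid that breakage. A minimal sketch, where etcd_version_from_tag is a hypothetical helper that extracts the bare version from either v3.6.6 or 3.6.6-0; it deliberately rejects anything else, including per-arch and pre-release tags:

```shell
# Hypothetical helper: print the bare etcd version for either tag style
# (v<version> or <version>-<n>); return 1 for any other tag shape.
etcd_version_from_tag() {
    local version
    version="$(printf '%s\n' "$1" |
        sed -nE 's/^v?([0-9]+\.[0-9]+\.[0-9]+)(-[0-9]+)?$/\1/p')"
    [ -n "${version}" ] && printf '%s\n' "${version}"
}

for tag in v3.6.6 3.6.6-0 v3.6.6-arm64; do
    if parsed="$(etcd_version_from_tag "${tag}")"; then
        echo "${tag}: etcd ${parsed}"
    else
        echo "${tag}: unrecognized tag format"
    fi
done
```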

neolit123 avatar Nov 24 '25 10:11 neolit123

Note: consider only using etcd officially released multi-arch image tags, e.g. v3.6.6, and stop using tags like 3.6.6-0.

With regard to what @neolit123 said, is there a good way to notify the community that etcd no longer uses -0 tag images in k8s.io, or is that not required?

joshjms avatar Nov 24 '25 11:11 joshjms

a related k8s PR can have a release note prefixed with "action required:", which puts it at the top of the release notes for higher visibility.

neolit123 avatar Nov 24 '25 11:11 neolit123

Ahh ok that works :D.

joshjms avatar Nov 24 '25 11:11 joshjms

Here is what I think we can do:

  • Only use tags like v3.6.6 in the Kubernetes workflow (this won't break anything, so we can do it now)
  • For Kubeadm, it's still better to use the consistent tags (e.g. v3.6.6) rather than 3.6.6-0
    • But it may hurt the user experience, so we can't change it right away. We need a long deprecation process. We just need to say that starting from Kubernetes v1.36 or etcd v3.7, Kubeadm will switch from tags like 3.6.6-0 to v3.6.6.
    • For now, we will keep adding both tags (e.g. v3.6.6 and 3.6.6-0) when publishing etcd images.
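
Publishing both tags could look roughly like the following. This is a dry-run sketch that only prints the crane copy commands; publish_cmds is a hypothetical helper, and whether the release script would push to the staging registry this way is an assumption:

```shell
# Hypothetical helper: print the commands that would publish one etcd
# release under both tag styles. Nothing is copied; this is a dry run.
publish_cmds() {
    local version="$1"   # bare version without a leading "v", e.g. 3.6.6
    local src="gcr.io/etcd-development/etcd:v${version}"
    local dst="gcr.io/k8s-staging-etcd/etcd"
    echo "crane copy ${src} ${dst}:v${version}"
    echo "crane copy ${src} ${dst}:${version}-0"
}

publish_cmds 3.6.6
```

Both commands copy the same upstream image, so the two tags would end up pointing at the same digest, and dropping the -0 tag later would not change the image itself.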

ahrtr avatar Nov 24 '25 11:11 ahrtr