
Prepare 0.30.0 release

Open tkatila opened this issue 1 year ago • 9 comments

Checklist:

  • [x] run validation on main
    • [x] QAT (generic)
    • [x] GNR (IAA, DSA)
    • [x] GNR-D (QAT)
    • [x] FPGA
    • [x] SPR (SGX, QAT, GPU, IAA, DSA)
  • [x] Make sure kube-rbac-proxy is the latest version
  • [x] create release-0.30 branch
  • [x] release branch changes
    • [x] edit default_labels.docker + make dockerfiles
    • [x] make set-version TAG=0.30.0 + commit
    • [x] update publish.yml to create docs for v0.30
  • [x] draft release notes, review
  • [x] publish release
  • [x] main branch changes
    • [x] update base README for supported versions and docs URL
    • [x] update main branch's operator CRs to point to 0.30, also reconciler.go
  • [x] update helm chart: PR
    • [x] Make sure to update CRDs and README
  • [ ] update operatorhub.io bundle

tkatila avatar May 07 '24 17:05 tkatila

There is an error related to this commit. https://github.com/k8s-operatorhub/community-operators/actions/runs/9137548241/job/25127696012?pr=4366#step:3:5083

Also, it seems that the CI/CD tests in operatorhub have a specific namespace, 'testeupgrade', which may mean that we cannot publish the bundle as it is now.

I also tested locally and got the same error messages. In addition, when I remove the contents of the commit above, the test runs successfully.

What do we need to do?

hj-johannes-lee avatar May 18 '24 06:05 hj-johannes-lee

What do we need to do?

Find out what the error is about and plan the fix accordingly. I'm not clear why it fails. Did you check what the test case is about and what we are doing wrong?

mythi avatar May 20 '24 05:05 mythi

I wonder if it's some upgrade test where the changed labels cause confusion.

edit: nevermind, apparently I can't read.

The fix that is causing this was related to the operator bundle (or multiples of them) so reverting the fix would just re-introduce the issue. Kinda.

tkatila avatar May 20 '24 05:05 tkatila

Thanks to the help of @tkatila, I figured out that it is not possible to change the labels from the previous version.

We added one more label compared to the previous version, so upgrading from the previous version is not possible. I found a similar case (https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/issues/7331).

I found one source that talks about solving this problem. https://olm.operatorframework.io/docs/troubleshooting/clusterserviceversion/

So we may need to publish a 0.30.0 that avoids the problem caused by the added label, and then a 0.30.1 that would be the 'real' version of the published operator.
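To make the label problem concrete, here is a minimal sketch of the check that seems to be tripping the upgrade test. It assumes the failure comes from the Deployment selector, which is immutable in apps/v1: an update that keeps the Deployment name but changes spec.selector is rejected, while a Deployment with a new name is simply created. The label keys and values and the name some-other-name are placeholders, not the project's real ones.

```python
def selector_conflict(old: dict, new: dict) -> bool:
    """Return True if `new` would update `old` in place with a different selector."""
    same_name = old["metadata"]["name"] == new["metadata"]["name"]
    old_sel = old["spec"]["selector"]["matchLabels"]
    new_sel = new["spec"]["selector"]["matchLabels"]
    return same_name and old_sel != new_sel


# Hypothetical manifests: label keys and values are placeholders.
previous = {
    "metadata": {"name": "inteldeviceplugins-controller-manager"},
    "spec": {"selector": {"matchLabels": {"control-plane": "controller-manager"}}},
}
candidate = {
    "metadata": {"name": "inteldeviceplugins-controller-manager"},
    "spec": {"selector": {"matchLabels": {
        "control-plane": "controller-manager",
        "example.com/extra": "added",  # the one label added in the new version
    }}},
}
# Same spec under a different (hypothetical) name: treated as a new Deployment.
renamed = {**candidate, "metadata": {"name": "some-other-name"}}

print(selector_conflict(previous, candidate))  # same name, selector changed
print(selector_conflict(previous, renamed))    # different name, nothing to update
```

This only models the name and selector comparison; it says nothing about what OLM does with the old Deployment once a renamed one appears.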

hj-johannes-lee avatar May 20 '24 13:05 hj-johannes-lee

https://github.com/k8s-operatorhub/community-operators/pull/4375 All tests passed there. So I guess we have two options.

  1. Change the name of the deployment from inteldeviceplugins-controller-manager to something else permanently
  2. Change the name of the deployment from inteldeviceplugins-controller-manager to something else temporarily and change it back in version 0.30.1.

I am suggesting the second option because I do not know whether we need to 'keep' the current name inteldeviceplugins-controller-manager.

hj-johannes-lee avatar May 20 '24 13:05 hj-johannes-lee

@mythi @tkatila Let me know which way you think is better! :)

hj-johannes-lee avatar May 21 '24 07:05 hj-johannes-lee

I'm trying to think of a way that would not include bumping up the version number and creating a patch release.

If we update the name permanently, what are the downsides? Would some upgrade somewhere result in two copies of the operator? Are we sure a 0.29.0->0.30.0->0.30.1 upgrade path would work (changing the name back and forth)? What if the user upgrades from 0.29.0 to 0.30.1, wouldn't they get the same error?

tkatila avatar May 21 '24 11:05 tkatila

  1. Change the name of the deployment from inteldeviceplugins-controller-manager to something else permanently

What happens to the old deployment if you add a new (renamed) one as part of the OLM upgrade?

mythi avatar May 22 '24 06:05 mythi

I submitted a question to the community operators project: https://github.com/k8s-operatorhub/community-operators/issues/4434

tkatila avatar May 30 '24 11:05 tkatila

It seems that they are not replying. Can we just proceed as the official documentation suggests (changing inteldeviceplugins-controller-manager to something else permanently)?

hj-johannes-lee avatar Jul 09 '24 14:07 hj-johannes-lee

After discussing with @hj-johannes-lee I'd propose a transient deployment name change in the operator bundle:

  1. Release 0.30.0 with a different deployment name (only in operator bundle)
  2. Keep deployment name as-is in the main branch
  3. With 0.31.0 release in the operator bundle, the deployment name would "revert" back to the original one
  4. In the 0.31.0 release notes, we would make a note that upgrade from <=0.29.0 to 0.31.0 is not possible without going to 0.30.0 first.
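The upgrade paths implied by this proposal can be sketched as below. This is a rough model under stated assumptions, not a description of OLM's actual behavior: a step is taken to fail only when the Deployment name stays the same while the selector differs, and the previous version's Deployment is assumed to be gone before the next step. The name transient-name and the label sets are placeholders.

```python
ORIGINAL = "inteldeviceplugins-controller-manager"
TRANSIENT = "transient-name"  # hypothetical placeholder for the 0.30.0 bundle name

OLD_SELECTOR = frozenset({"label-a"})             # hypothetical labels up to 0.29.0
NEW_SELECTOR = frozenset({"label-a", "label-b"})  # hypothetical: one label added

# Bundle version -> (deployment name, selector labels) under the proposal.
BUNDLES = {
    "0.29.0": (ORIGINAL, OLD_SELECTOR),
    "0.30.0": (TRANSIENT, NEW_SELECTOR),
    "0.31.0": (ORIGINAL, NEW_SELECTOR),
}


def step_ok(src: str, dst: str) -> bool:
    """A step works if the name changes or the selector stays the same."""
    src_name, src_sel = BUNDLES[src]
    dst_name, dst_sel = BUNDLES[dst]
    return src_name != dst_name or src_sel == dst_sel


def path_ok(path: list[str]) -> bool:
    """Check every consecutive pair of versions on an upgrade path."""
    return all(step_ok(a, b) for a, b in zip(path, path[1:]))


print(path_ok(["0.29.0", "0.30.0", "0.31.0"]))  # stepwise upgrade
print(path_ok(["0.29.0", "0.31.0"]))            # skipping 0.30.0
```

Under these assumptions the stepwise path passes and the direct 0.29.0 to 0.31.0 jump does not, which is what the release note in step 4 would warn about.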

tkatila avatar Jul 15 '24 09:07 tkatila

@mythi https://github.com/k8s-operatorhub/community-operators/pull/4375 It's ready to be merged. If you agree to go forward, I'll get it merged.

hj-johannes-lee avatar Jul 23 '24 12:07 hj-johannes-lee

@mythi k8s-operatorhub/community-operators#4375 It's ready to be merged. If you agree to go forward, I'll get it merged.

What is the reason for step 3?

mythi avatar Jul 23 '24 13:07 mythi

Umm, to be honest, I think there would be no problem with changing to something else permanently (only as far as the operatorhub bundle is concerned). But Tuomas thought there might be some problems.

hj-johannes-lee avatar Jul 23 '24 14:07 hj-johannes-lee

What about getting the PR merged first and deciding about steps 3 and 4 later?

hj-johannes-lee avatar Jul 23 '24 22:07 hj-johannes-lee

What about getting the PR merged first and deciding about steps 3 and 4 later?

works for me. can you also submit a PR here to get that warning fixed?

mythi avatar Jul 24 '24 05:07 mythi

What about getting the PR merged first and deciding about steps 3 and 4 later?

works for me. can you also submit a PR here to get that warning fixed?

nevermind, I just ran into #1785

mythi avatar Jul 24 '24 07:07 mythi

Published. The deployment name is inteldeviceplugins-controller-manager-0-30-0

We can decide later whether we change back to inteldeviceplugins-controller-manager or just create a new, permanent one.

hj-johannes-lee avatar Jul 24 '24 15:07 hj-johannes-lee

The new name looks odd and forces us to make a change...

mythi avatar Jul 24 '24 16:07 mythi

@mythi what name do you think is good?

hj-johannes-lee avatar Jul 25 '24 11:07 hj-johannes-lee

@mythi what name do you think is good?

something that is not attached to a specific version (e.g., 0-30-0)

mythi avatar Jul 25 '24 11:07 mythi

Then let's change it from inteldeviceplugins-controller-manager to intel-deviceplugins-controller-manager: https://github.com/k8s-operatorhub/community-operators/pull/4743#issuecomment-2250136002

hj-johannes-lee avatar Jul 25 '24 11:07 hj-johannes-lee

As discussed, let's keep the deployment name in the bundle the same (as it is in 0.30.0), and change the deployment name in the project YAMLs. This will require manual changes for the 0.31.0 bundle, but 0.32.0 onward shouldn't require any manual edits.

tkatila avatar Oct 01 '24 10:10 tkatila