Unable to Upgrade from 1.12.10 to 1.13.4 Using Helm Mode

Open learnitall opened this issue 1 year ago • 0 comments

Bug report

General Information

  • Cilium CLI version (run cilium version)
cilium-cli: v0.15.0 compiled with go1.20.4 on linux/amd64
cilium image (default): v1.13.4
cilium image (stable): v1.13.4
cilium image (running): -ci:36cb0eed03c8bc6576f3fd33a94440c70ae18974-unstripped
  • Orchestration system version in use (e.g. kubectl version, ...)
clientVersion:
  buildDate: "1980-01-01T00:00:00Z"
  compiler: gc
  gitCommit: 25b4e43193bcda6c7328a6d147b1fb73a33f1598
  gitTreeState: archive
  gitVersion: v1.27.3
  goVersion: go1.20.5
  major: "1"
  minor: "27"
  platform: linux/amd64
kustomizeVersion: v5.0.1
serverVersion:
  buildDate: "2023-06-01T19:55:11Z"
  compiler: gc
  gitCommit: 4eb57372b0f0ac0023caca078161b75185febeef
  gitTreeState: clean
  gitVersion: v1.26.5-gke.1200
  goVersion: go1.19.9 X:boringcrypto
  major: "1"
  minor: "26"
  platform: linux/amd64
  • Platform / infrastructure information (e.g. AWS / Azure / GCP, image / kernel versions): GCP, Container-Optimized OS build 7162.210.18, Kernel 5.15.107

How to reproduce the issue

  1. Create a cluster on GKE.
  2. Put cilium-cli in helm mode.
  3. Install Cilium 1.12.10. I used the following command:
cilium install --version v1.12.10 \
  --set upgradeCompatibility=v1.12.10 \
  --set image.override=quay.io/cilium/cilium-ci:628b5209efd22e8a102919564c5c30277ca1c9b0-unstripped \
  --set operator.image.override=quay.io/cilium/operator-generic-ci:628b5209efd22e8a102919564c5c30277ca1c9b0-unstripped
  4. Upgrade Cilium to 1.13.3. I used the following command:
cilium upgrade --version v1.13.3 \
  --set upgradeCompatibility=v1.12.10 \
  --set image.override=quay.io/cilium/cilium-ci:36cb0eed03c8bc6576f3fd33a94440c70ae18974-unstripped \
  --set operator.image.override=quay.io/cilium/operator-generic-ci:36cb0eed03c8bc6576f3fd33a94440c70ae18974-unstripped

The upgrade fails during cilium-agent init, with kubectl describe reporting something like this:

Warning  FailedMount  9m9s (x22 over 39m)  kubelet            MountVolume.SetUp failed for volume "cni-path" : mkdir /opt/cni: read-only file system
  Warning  FailedMount  3m41s (x9 over 19m)  kubelet            (combined from similar events): Unable to attach or mount volumes: unmounted volumes=[cni-path], unattached volumes=[cni-path tmp hostproc host-proc-sys-kernel cilium-cgroup host-proc-sys-net bpf-maps cilium-run kube-api-access-gj8bc clustermesh-secrets lib-modules xtables-lock etc-cni-netd hubble-tls]: timed out waiting for the condition

During the upgrade, helm values are reset to the chart defaults. Platform-specific values are not added to them, as they would be during install, so the values GKE needs are not applied. Passing --reuse-values doesn't work either, as I get the following:

Error: Unable to upgrade Cilium: template: cilium/templates/hubble/dashboards-configmap.yaml:1:14: executing "cilium/templates/hubble/dashboards-configmap.yaml" at <.Values.hubble.metrics.dashboards.enabled>: nil pointer evaluating interface {}.enabled

I think this error is expected, given the warning about --reuse-values in our current documentation.

As a workaround, I'm following the instructions in the docs mentioned above, saving helm values and passing them to the upgrade command using -f. This preserves the GKE-specific options needed for installation.
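
For reference, the workaround can be sketched roughly as below. This is a minimal sketch, not the exact commands from the docs: the release name "cilium", the namespace "kube-system", and the file name cilium-values.yaml are assumptions, and the target version is taken from the upgrade step above. The -f flag is the one mentioned in this report.

```shell
#!/bin/sh
# Sketch of the workaround: save the release's Helm values, then pass them
# back to the upgrade with -f so the GKE-specific options survive.
# Assumptions (not stated in the issue): release name "cilium", namespace
# "kube-system", and the file name cilium-values.yaml.
RELEASE="${RELEASE:-cilium}"
NAMESPACE="${NAMESPACE:-kube-system}"
VALUES_FILE="${VALUES_FILE:-cilium-values.yaml}"
TARGET_VERSION="${TARGET_VERSION:-v1.13.3}"

# Dump the user-supplied values of the installed release to a file.
save_values() {
  helm get values "$RELEASE" --namespace "$NAMESPACE" -o yaml > "$VALUES_FILE"
}

# Upgrade with the saved values; extra flags (e.g. --set ...) are passed through.
upgrade() {
  cilium upgrade --version "$TARGET_VERSION" -f "$VALUES_FILE" "$@"
}

# Nothing touches the cluster unless RUN=1 is set explicitly.
if [ "${RUN:-0}" = "1" ]; then
  save_values && upgrade "$@"
fi
```

Run with RUN=1 to execute both steps. Any image overrides saved in the values file come from the old install, so they would likely still need to be overridden with --set on the upgrade.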

I think we should add functionality to cilium upgrade that preserves platform-specific options, or at least document how to save values when cilium-cli is in helm mode.

learnitall, Jul 11 '23 21:07