Chart breaks when upgrading from k8s 1.24 to 1.25
Describe the bug
Currently the chart doesn't survive an upgrade of k8s 1.24.x to 1.25 due to PodSecurityPolicies:
Helm upgrade failed: unable to build kubernetes objects from current release manifest: resource mapping not found for name: "tridentoperatorpods" namespace: "" from "": no matches for kind "PodSecurityPolicy" in version "policy/v1beta1"
ensure CRDs are installed first
Environment
Provide accurate information about the environment to help us reproduce the issue.
- Trident version: [e.g. 19.10] 22.10
- Trident installation flags used: [e.g. -d -n trident --use-custom-yaml]
- Container runtime: [e.g. Docker 19.03.1-CE]
- Kubernetes version: [e.g. 1.15.1]
- Kubernetes orchestrator: [e.g. OpenShift v3.11, Rancher v2.3.3]
- Kubernetes enabled feature gates: [e.g. CSINodeInfo]
- OS: [e.g. RHEL 7.6, Ubuntu 16.04]
- NetApp backend types: [e.g. CVS for AWS, ONTAP AFF 9.5, HCI 1.7]
- Other:
To Reproduce
Have the chart installed in a 1.24.x cluster, upgrade K8s to 1.25, then try to upgrade/change the chart.
Expected behavior
The chart doesn't break.
Additional context
The root cause is https://github.com/NetApp/trident/blob/ee233f996b854b3ce2aee995467d1ab2519265cd/helm/trident-operator/templates/podsecuritypolicy.yaml#L1. Basically Helm keeps track of the PSP and wants to remove it once that condition evaluates to false, but K8s no longer knows anything about that resource kind, so Helm fails. The only way to prevent this is to manually stop PSPs from being created while still on 1.24 (which is bad, and most people will forget) or to automatically drop PSPs on 1.24 unless manually enabled, making sure the resource is deleted while the API still knows about it.
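A rough, untested sketch of that manual workaround, run while the cluster is still on 1.24 and assuming the installed chart version already supports the excludePodSecurityPolicy value (mentioned later in this thread):
# confirm the PSP API is still served and the Trident PSP exists
$ kubectl api-versions | grep policy/v1beta1
$ kubectl get podsecuritypolicy tridentoperatorpods
# re-run the upgrade without the PSP so Helm's stored release manifest no longer references it
$ helm upgrade -n trident trident-operator netapp-trident/trident-operator --set excludePodSecurityPolicy=true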
Yes, we've run into the same problem. PSPs have been deprecated for a long time now; Trident should remove them completely from the upstream manifests.
Something else to note: 1.25 needs the latest Trident version, 23.01. https://github.com/NetApp/trident/releases/download/v23.01.0/trident-installer-23.01.0.tar.gz
According to "Helm upgrade error after Kubernetes Upgrade to 1.25 with Trident installed", the upgrade to Trident version 23.01 removes the PodSecurityPolicies.
Solution
* The upgrade to Trident 23.01 will fix the issue, i.e. remove Trident's PodSecurityPolicies.
* Another way to fix the issue while staying on Trident 22.10.0 is to uninstall Trident with Helm and reinstall it, with or without the operator, using tridentctl (see the sketch below).
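Roughly, the two options look like this (commands are illustrative; adjust namespace, release name, and flags for your setup):
# Option 1: upgrade the operator chart to 23.01, which drops the PSP from the release
$ helm upgrade -n trident trident-operator netapp-trident/trident-operator --version 23.01.0
# Option 2: stay on 22.10.0 - remove the Helm release, then reinstall with tridentctl
$ helm uninstall -n trident trident-operator
$ ./tridentctl install -n trident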
A few months ago I ran k8s 1.23 with Trident 22.7.0. I first upgraded Trident from 22.7.0 to 23.01.0, then k8s from 1.23 => 1.24 => 1.25 => 1.26 without a hitch.
Now I'm trying to deploy Trident 23.04.0 prior to upgrading to k8s 1.27, but I'm still getting the same error as you :thinking:
$ helm list -n trident
NAME              NAMESPACE  REVISION  UPDATED                                  STATUS    CHART                     APP VERSION
trident-operator  trident    2         2023-03-06 16:16:29.842913588 +0100 CET  deployed  trident-operator-23.01.0  23.01.0
$ helm upgrade -n trident trident-operator netapp-trident/trident-operator --version 23.04.0
Error: UPGRADE FAILED: unable to build kubernetes objects from current release manifest: resource mapping not found for name: "tridentoperatorpods" namespace: "" from "": no matches for kind "PodSecurityPolicy" in version "policy/v1beta1"
ensure CRDs are installed first
On the helm upgrade line for Trident, add the parameter: --set excludePodSecurityPolicy=true
I'm afraid I already tried that, after checking the release notes: "When upgrading a Kubernetes cluster from 1.24 to 1.25 or later that has Astra Trident installed, you must update values.yaml to set excludePodSecurityPolicy to true or add --set excludePodSecurityPolicy=true to the helm upgrade command before you can upgrade the cluster."
$ helm upgrade --set excludePodSecurityPolicy=true -n trident trident-operator netapp-trident/trident-operator --version 23.04.0
Error: UPGRADE FAILED: unable to build kubernetes objects from current release manifest: resource mapping not found for name: "tridentoperatorpods" namespace: "" from "": no matches for kind "PodSecurityPolicy" in version "policy/v1beta1"
ensure CRDs are installed first
Could it be because I installed Trident while on 1.23, so the PSPs were there at the time, and Helm is now still looking for them because it wants to compare resources from the last time it upgraded Trident? I installed Trident 23.01 on k8s 1.23 prior to upgrading to 1.24, and never thought about excludePodSecurityPolicy later, so it was not set when I did 1.24 => 1.25.
I was able to fix it by editing the sh.helm.release.v1.trident-operator.v# secret's data.release contents.
More specifically, I think I removed these parts from it:
{"name":"templates/podsecuritypolicy.yaml","data":"e3st...IH19Cg=="},
as well as
# Source: trident-operator/templates/podsecuritypolicy.yaml\napiVersion: policy/v1beta1\nkind: PodSecurityPolicy\nmetadata:\n name: tridentoperatorpods\n labels:\n app: operator.trident.netapp.io\nspec:\n privileged: false\n seLinux:\n rule: RunAsAny\n supplementalGroups:\n rule: RunAsAny\n runAsUser:\n rule: RunAsAny\n fsGroup:\n rule: RunAsAny\n volumes:\n - projected\n---\n
Perhaps the latter one would have been enough.
After replacing data.release in the secret with the updated version (i.e. after applying gzip -c | base64 | base64 -w0 to the data again) I could upgrade successfully.
$ helm upgrade --set kubeletDir=/var/lib/k0s/kubelet --set excludePodSecurityPolicy=true -n trident trident-operator netapp-trident/trident-operator --version 23.04.0
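For reference, a rough sketch of the decode/edit/re-encode steps described above (the secret name and revision are examples; list yours with kubectl get secret -n trident | grep sh.helm.release):
$ kubectl get secret -n trident sh.helm.release.v1.trident-operator.v2 -o jsonpath='{.data.release}' | base64 -d | base64 -d | gzip -d > release.json
# edit release.json: remove the templates/podsecuritypolicy.yaml entry and the PSP manifest shown above
# using -w0 on both base64 passes avoids stray newlines in the stored payload
$ gzip -c release.json | base64 -w0 | base64 -w0 > release.b64
$ kubectl patch secret -n trident sh.helm.release.v1.trident-operator.v2 --type merge -p "{\"data\":{\"release\":\"$(cat release.b64)\"}}"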
Do we have any update on this? We are seeing a similar issue. I believe we have a bug ticket for this: https://github.com/NetApp/trident/issues/819
It would be nice to have an update or a bugfix release. It's nice that the operator will delete the PSPs during the update, but unfortunately K8s distributions like OpenShift and RKE will not let you start an upgrade to K8s 1.25 or later until there are no existing PSPs in the cluster (see the quick check below).
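While still on 1.24 you can at least clear that pre-flight check manually; note this only removes the live object and does not fix the Helm release manifest (see the secret edit above):
$ kubectl get podsecuritypolicies
$ kubectl delete podsecuritypolicy tridentoperatorpods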
Hello all, I would recommend the Rancher article: https://ranchermanager.docs.rancher.com/how-to-guides/new-user-guides/authentication-permissions-and-global-configuration/pod-security-standards#cleaning-up-releases-after-a-kubernetes-v125-upgrade
Already tested:
export KUBECONFIG=...
helm -n trident list
NAME                            NAMESPACE  REVISION  CHART                     APP VERSION
trident-operator-22-1680184337  trident    4         trident-operator-23.04.0  23.04.0
helm mapkubeapis --dry-run -n trident trident-operator-22-1680184337
and then without "--dry-run":
helm mapkubeapis -n trident trident-operator-22-1680184337
Then upgrade to the same version but with "excludePodSecurityPolicy=true", or get the Helm chart and images for trident-operator-23.07.0 and upgrade to 23.07.0 (rough sequence sketched below).
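Roughly the full sequence, assuming the mapkubeapis plugin still needs to be installed and reusing the release/chart names from above:
$ helm plugin install https://github.com/helm/helm-mapkubeapis
$ helm mapkubeapis -n trident trident-operator-22-1680184337
$ helm upgrade -n trident trident-operator-22-1680184337 netapp-trident/trident-operator --version 23.07.0 --set excludePodSecurityPolicy=true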
Regards, temirg.