Disable validation-webhook daemonset for managed clusters
When running the node density workloads on managed clusters, starting with 4.10.5 on ROSA, we see an error like 'admission webhook "regular-user-validation.managed.openshift.io" denied the request' when trying to add a label to a node directly with oc label node. The recommended way to add labels to nodes on managed ROSA clusters is to edit the machinepool. However, the Default machinepool cannot be edited to add labels, and the attempt fails with an error like "Labels cannot be updated on the Default machine pool".

The only way to add a label is to disable the validation-webhook daemonset, and thereby admission control, in the openshift-validation-webhook project, which exists only on managed services clusters. We disable the daemonset by adding a fake nodeSelector before labeling the nodes, and we remove the nodeSelector after unlabeling the nodes. Adding a nodeSelector on top of the existing nodeAffinity means that both conditions need to be met for a pod to be scheduled, so with a selector that matches no node, no webhook pods run. Also, a nodeSelector added this way is not overwritten during reconciliation, whereas changes to nodeAffinity are.
Signed-off-by: Sai Sindhur Malleni [email protected]
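For reviewers who want to see the sequence concretely, here is a minimal sketch of the steps described above. It is not the script changed in this PR. The daemonset and namespace names are taken from the description; the fake selector label, the node name, and the workload label are hypothetical placeholders. It defaults to a dry run that only prints the oc commands.

```python
import json
import shlex
import subprocess

NAMESPACE = "openshift-validation-webhook"  # project named in the description
DAEMONSET = "validation-webhook"            # daemonset named in the description
# Hypothetical label that no node carries, so no webhook pod can be scheduled.
FAKE_SELECTOR = {"non-existing-label": "true"}


def selector_patch(selector):
    """Build a JSON merge patch for the pod template's nodeSelector.

    Passing None yields "nodeSelector": null, which a merge patch
    treats as "delete this field".
    """
    return json.dumps({"spec": {"template": {"spec": {"nodeSelector": selector}}}})


def patch_command(selector):
    """argv for `oc patch` that sets or removes the nodeSelector."""
    return ["oc", "-n", NAMESPACE, "patch", "daemonset", DAEMONSET,
            "--type=merge", "-p", selector_patch(selector)]


def label_command(node, label, remove=False):
    """argv for `oc label node`; a trailing '-' on the key removes the label."""
    key = label.split("=")[0]
    return ["oc", "label", "node", node, f"{key}-" if remove else label]


def plan(nodes, label):
    """Ordered commands: disable webhook, label, (run workload), unlabel, restore."""
    steps = [patch_command(FAKE_SELECTOR)]
    steps += [label_command(n, label) for n in nodes]
    # The node density workload would run at this point.
    steps += [label_command(n, label, remove=True) for n in nodes]
    steps.append(patch_command(None))
    return steps


def run(steps, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for argv in steps:
        if dry_run:
            print(shlex.join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # "worker-0" and "node-density=enabled" are illustrative values only.
    run(plan(["worker-0"], "node-density=enabled"))
```

Restoring the daemonset as the last step matters: if the run is interrupted before the final patch, the webhook pods stay unscheduled until the nodeSelector is removed by hand.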
Description
Fixes
Curious: why do we need to label nodes?
I assume this is for the old node-density implementation in e2e-benchmarking; the implementation based on kube-burner's OCP wrapper doesn't need to label nodes anymore.
Rather than continuing to fix and update the old implementation, we should encourage users to move to the new one.
Oh, I didn't realize that was the case, thanks for surfacing that.