spark-operator
Kubernetes v1.22 admissionregistration.k8s.io/v1beta1 api removal
Hello,
The new version of Kubernetes, 1.22, no longer supports the APIs used for webhook creation. Are there releases scheduled to change the API version for Webhook resources?
I reported this in May, and I am starting to get worried. https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/1265
Hello team, I would like to work on this issue and create a PR for the required change.
Any news on this? We're now blocked from upgrading to the latest 1.22 because of this. I can imagine we're not the only ones...
Hi @aneagoe, I have already created a PR #1401 for this issue. Just need to add docs with a new version as this is a major change to the operator.
@sairamankumar2 wow, thanks a lot for updating the issue with the PR link and of course for your efforts! Looking forward to seeing this merged.
The PR has been successfully merged and I think we can close this issue.
The image gcr.io/spark-operator/spark-operator:v1beta2-1.3.0-3.1.1, which contains the change from the PR, has been built and pushed.
@sairamankumar2 @liyinan926 it seems there are some issues with the webhook creation: the pod fails to start with chart version 1.1.14 and image gcr.io/spark-operator/spark-operator:v1beta2-1.3.0-3.1.1:
++ id -u
+ myuid=0
++ id -g
+ mygid=0
+ set +e
++ getent passwd 0
+ uidentry=root:x:0:0:root:/root:/bin/bash
+ set -e
+ echo 0
+ echo 0
+ echo root:x:0:0:root:/root:/bin/bash
+ [[ -z root:x:0:0:root:/root:/bin/bash ]]
0
0
root:x:0:0:root:/root:/bin/bash
+ exec /usr/bin/tini -s -- /usr/bin/spark-operator -v=2 -logtostderr -namespace=spark -enable-ui-service=true '-ingress-url-format={{$appName}}.dwzug.tensor.tech' -controller-threads=10 -resync-interval=30 -enable-batch-scheduler=false -label-selector-filter= -enable-metrics=true -metrics-labels=app_type -metrics-port=10254 -metrics-endpoint=/metrics -metrics-prefix= -enable-webhook=true -webhook-svc-namespace=spark -webhook-port=8080 -webhook-timeout=30 -webhook-svc-name=spark-operator-webhook -webhook-config-name=spark-operator-webhook-config -webhook-namespace-selector= -enable-resource-quota-enforcement=false
I1130 14:28:51.806391 10 main.go:145] Starting the Spark Operator
I1130 14:28:51.806617 10 main.go:178] Enabling metrics collecting and exporting to Prometheus
I1130 14:28:51.806701 10 metrics.go:142] Started Metrics server at localhost:10254/metrics
I1130 14:28:51.807346 10 webhook.go:219] Starting the Spark admission webhook server
I1130 14:28:51.821438 10 webhook.go:415] Creating a MutatingWebhookConfiguration for the Spark pod admission webhook
F1130 14:28:51.823450 10 main.go:209] MutatingWebhookConfiguration.admissionregistration.k8s.io "spark-operator-webhook-config" is invalid: [webhooks[0].sideEffects: Required value: must specify one of None, NoneOnDryRun, webhooks[0].admissionReviewVersions: Required value: must specify one of v1, v1beta1]
It looks like sideEffects and admissionReviewVersions behave differently in v1. Basically, in v1beta1 they used to have defaults, and now they do not. The code needs to be adjusted so that both of these fields are specified when the MutatingWebhookConfiguration is created.
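For illustration, in admissionregistration.k8s.io/v1 both fields must be declared explicitly on every webhook entry. A minimal sketch of a valid manifest follows; the webhook name, service path, and rules here are assumptions for illustration (the service/namespace names are taken from the operator flags in the log above), not the operator's exact output:

```yaml
# Illustrative v1 manifest; webhook name, path, and rules are assumptions.
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: spark-operator-webhook-config
webhooks:
  - name: webhook.sparkoperator.k8s.io   # assumed webhook name
    sideEffects: None                    # required in v1; no default
    admissionReviewVersions: ["v1"]      # required in v1; no default
    clientConfig:
      service:
        name: spark-operator-webhook
        namespace: spark
        path: /webhook                   # assumed path
      caBundle: <base64-encoded CA>      # placeholder
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods"]
```

Omitting either sideEffects or admissionReviewVersions produces exactly the "Required value" error shown in the log.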
@aneagoe, as I see it, these changes were added in PR #1413.
That's the exact error I bumped into when testing the latest changes. I've been able to deploy the webhook successfully with the code from that PR.
Thanks, I see the PR is merged now. However, the chart seems to reference a nonexistent image:
34s Warning Failed pod/spark-operator-webhook-cleanup-vjfdw Failed to pull image "gcr.io/spark-operator/spark-operator:v1beta2-1.3.1-3.1.1": rpc error: code = NotFound desc = failed to pull and unpack image "gcr.io/spark-operator/spark-operator:v1beta2-1.3.1-3.1.1": failed to resolve reference "gcr.io/spark-operator/spark-operator:v1beta2-1.3.1-3.1.1": gcr.io/spark-operator/spark-operator:v1beta2-1.3.1-3.1.1: not found
One could of course build manually and work around this, but it would be nice to release spark-operator chart and dependent images together.
I'm running into the same thing right now. I asked in the Slack channel about the process for getting new images uploaded; the current CI setup doesn't automatically upload them to GCR. I'm building the image locally to work around it for now.
I've been doing some testing and I think there are some further modifications that need to be made. I noticed that while the webhook was returning an AdmissionResponse, the pod was not modified. I believe it's because k8s didn't think it was a valid AdmissionResponse object.
This change https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/pull/1421 resolved it in my testing.
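For context: with admission.k8s.io/v1, the AdmissionReview object the webhook returns must itself carry apiVersion and kind, whereas in v1beta1 the API server tolerated their omission. A sketch of a well-formed v1 response body (the uid and patch values are placeholders):

```json
{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "response": {
    "uid": "<uid copied from the request>",
    "allowed": true,
    "patchType": "JSONPatch",
    "patch": "<base64-encoded JSON patch>"
  }
}
```

If apiVersion/kind are missing, the API server rejects the response as invalid and the pod mutation silently does not happen, matching the behavior described above.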
@liyinan926 would it be possible to push an image containing the PRs mentioned here to gcr.io? I'm not exactly sure what the release process looks like and we've added this image internally but it would be great to not have to rely on private builds.
Any update on that issue ?
Any update on getting fixes pushed? I'm getting the following error when enabling the webhook:
I1220 20:37:06.712175 10 webhook.go:415] Creating a MutatingWebhookConfiguration for the Spark pod admission webhook
F1220 20:37:06.719623 10 main.go:209] MutatingWebhookConfiguration.admissionregistration.k8s.io "spark-operator-webhook-config" is invalid: [webhooks[0].sideEffects: Required value: must specify one of None, NoneOnDryRun, webhooks[0].admissionReviewVersions: Required value: must specify one of v1, v1beta1]
Also waiting eagerly for an official released version containing these changes, do we have an ETA for that?
Any update on when this might be addressed?
This has been addressed already, see https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/1450. In short, you can use the image ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.3-3.1.1. For an overview of the images published to the new registry, see https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/pkgs/container/spark-operator.
You'll simply need to set some helm overrides: --set image.tag=v1beta2-1.3.3-3.1.1 --set image.repository=ghcr.io/googlecloudplatform/spark-operator
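Equivalently, the overrides can live in a values file instead of --set flags; a minimal sketch, assuming the keys match the chart's standard image settings:

```yaml
# values.yaml override (illustrative)
image:
  repository: ghcr.io/googlecloudplatform/spark-operator
  tag: v1beta2-1.3.3-3.1.1
```

Passed via -f/--values, this keeps the registry switch under version control rather than in ad-hoc command-line flags.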
@aneagoe: Do you know when this version will be available on Docker Hub? I've been asked to update the version of the Spark operator that will be supported by OpenShift 4.9.