
Kubernetes v1.22 admissionregistration.k8s.io/v1beta1 API removal

Open Charmelionag opened this issue 3 years ago • 22 comments

Hello,

The new Kubernetes 1.22 release no longer supports the API version used for webhook creation. Is a release scheduled to move the webhook resources to the supported API version?
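For context, webhook configurations declared against the removed API group version have to be migrated to v1 (a minimal sketch; only the apiVersion line is the point here):

```yaml
# Deprecated: removed in Kubernetes 1.22
apiVersion: admissionregistration.k8s.io/v1beta1
kind: MutatingWebhookConfiguration
---
# Required on Kubernetes 1.22+
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
```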

Charmelionag avatar Oct 26 '21 08:10 Charmelionag

I reported this in May, and I am starting to get worried. https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/1265

jgoeres avatar Nov 02 '21 08:11 jgoeres

Hello team, I would like to work on this issue and create a PR for the required change.

sairamankumar2 avatar Nov 14 '21 18:11 sairamankumar2

Any news on this? We're now blocked from upgrading to the latest 1.22 because of this. I imagine we're not the only ones...

aneagoe avatar Nov 28 '21 14:11 aneagoe

Hi @aneagoe, I have already created a PR #1401 for this issue. I just need to add docs with a new version, as this is a major change to the operator.

sairamankumar2 avatar Nov 28 '21 14:11 sairamankumar2

@sairamankumar2 wow, thanks a lot for updating the issue with the PR link and of course for your efforts! Looking forward to seeing this merged.

aneagoe avatar Nov 28 '21 15:11 aneagoe

The PR has been successfully merged and I think we can close this issue.

sairamankumar2 avatar Nov 30 '21 05:11 sairamankumar2

Image gcr.io/spark-operator/spark-operator:v1beta2-1.3.0-3.1.1 that contains the change in the PR has been built and pushed.

liyinan926 avatar Nov 30 '21 05:11 liyinan926

@sairamankumar2 @liyinan926 it seems there are some issues with the webhook creation and the pod fails to start with chart version 1.1.14 and image gcr.io/spark-operator/spark-operator:v1beta2-1.3.0-3.1.1:

++ id -u
+ myuid=0
++ id -g
+ mygid=0
+ set +e
++ getent passwd 0
+ uidentry=root:x:0:0:root:/root:/bin/bash
+ set -e
+ echo 0
+ echo 0
+ echo root:x:0:0:root:/root:/bin/bash
+ [[ -z root:x:0:0:root:/root:/bin/bash ]]
0
0
root:x:0:0:root:/root:/bin/bash
+ exec /usr/bin/tini -s -- /usr/bin/spark-operator -v=2 -logtostderr -namespace=spark -enable-ui-service=true '-ingress-url-format={{$appName}}.dwzug.tensor.tech' -controller-threads=10 -resync-interval=30 -enable-batch-scheduler=false -label-selector-filter= -enable-metrics=true -metrics-labels=app_type -metrics-port=10254 -metrics-endpoint=/metrics -metrics-prefix= -enable-webhook=true -webhook-svc-namespace=spark -webhook-port=8080 -webhook-timeout=30 -webhook-svc-name=spark-operator-webhook -webhook-config-name=spark-operator-webhook-config -webhook-namespace-selector= -enable-resource-quota-enforcement=false
I1130 14:28:51.806391      10 main.go:145] Starting the Spark Operator
I1130 14:28:51.806617      10 main.go:178] Enabling metrics collecting and exporting to Prometheus
I1130 14:28:51.806701      10 metrics.go:142] Started Metrics server at localhost:10254/metrics
I1130 14:28:51.807346      10 webhook.go:219] Starting the Spark admission webhook server
I1130 14:28:51.821438      10 webhook.go:415] Creating a MutatingWebhookConfiguration for the Spark pod admission webhook
F1130 14:28:51.823450      10 main.go:209] MutatingWebhookConfiguration.admissionregistration.k8s.io "spark-operator-webhook-config" is invalid: [webhooks[0].sideEffects: Required value: must specify one of None, NoneOnDryRun, webhooks[0].admissionReviewVersions: Required value: must specify one of v1, v1beta1]

aneagoe avatar Nov 30 '21 14:11 aneagoe

It looks like sideEffects and admissionReviewVersions behave differently in v1: in v1beta1 both fields had defaults, but in v1 they do not. The code needs to be adjusted so that both parameters are set explicitly when the MutatingWebhookConfiguration is created.
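For reference, a v1 MutatingWebhookConfiguration with both required fields set explicitly might look like this (a sketch only; the webhook name, path, and rules are illustrative, while the service name and namespace are taken from the operator log above):

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: spark-operator-webhook-config
webhooks:
  - name: webhook.sparkoperator.k8s.io   # illustrative name
    # Both fields below were defaulted in v1beta1 but are required in v1
    sideEffects: None
    admissionReviewVersions: ["v1"]
    clientConfig:
      service:
        name: spark-operator-webhook
        namespace: spark
        path: /webhook                   # illustrative path
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods"]
```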

aneagoe avatar Dec 01 '21 13:12 aneagoe

@aneagoe, As I see it, these changes were added in PR #1413

ffxxe avatar Dec 01 '21 14:12 ffxxe

That's the exact error I bumped into when testing out the latest changes. I've been able to deploy the webhook successfully with the code from that PR.

ssullivan avatar Dec 02 '21 13:12 ssullivan

Thanks, I see the PR is merged now. However, the chart seems to reference a nonexistent image:

34s         Warning   Failed             pod/spark-operator-webhook-cleanup-vjfdw                Failed to pull image "gcr.io/spark-operator/spark-operator:v1beta2-1.3.1-3.1.1": rpc error: code = NotFound desc = failed to pull and unpack image "gcr.io/spark-operator/spark-operator:v1beta2-1.3.1-3.1.1": failed to resolve reference "gcr.io/spark-operator/spark-operator:v1beta2-1.3.1-3.1.1": gcr.io/spark-operator/spark-operator:v1beta2-1.3.1-3.1.1: not found

One could of course build the image manually to work around this, but it would be nice to release the spark-operator chart and its dependent images together.

aneagoe avatar Dec 03 '21 08:12 aneagoe

I'm running into the same thing right now. I asked in the Slack channel about the process for getting new images uploaded; the current CI setup doesn't automatically push them to GCR. I'm building the image locally as a workaround for now.

ssullivan avatar Dec 03 '21 13:12 ssullivan

I've been doing some testing and I think there are some further modifications that need to be made. I noticed that while the webhook was returning an AdmissionResponse, the pod was not modified. I believe it's because k8s didn't think it was a valid AdmissionResponse object.
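For reference, an AdmissionReview response in the v1 API must itself carry apiVersion and kind and must echo the request's uid; otherwise the API server rejects it and the pod goes through unpatched. A minimal sketch of a valid response (the uid and patch values are placeholders):

```json
{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "response": {
    "uid": "<uid copied from the incoming request>",
    "allowed": true,
    "patchType": "JSONPatch",
    "patch": "<base64-encoded JSON Patch>"
  }
}
```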

ssullivan avatar Dec 03 '21 19:12 ssullivan

This change https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/pull/1421 resolved it in my testing.

ssullivan avatar Dec 03 '21 23:12 ssullivan

@liyinan926 would it be possible to push an image containing the PRs mentioned here to gcr.io? I'm not exactly sure what the release process looks like and we've added this image internally but it would be great to not have to rely on private builds.

aneagoe avatar Dec 07 '21 13:12 aneagoe

Any update on this issue?

BenMizrahiPlarium avatar Dec 12 '21 16:12 BenMizrahiPlarium

Any update on getting fixes pushed? I'm getting the following error when enabling the webhook:

I1220 20:37:06.712175      10 webhook.go:415] Creating a MutatingWebhookConfiguration for the Spark pod admission webhook
F1220 20:37:06.719623      10 main.go:209] MutatingWebhookConfiguration.admissionregistration.k8s.io "spark-operator-webhook-config" is invalid: [webhooks[0].sideEffects: Required value: must specify one of None, NoneOnDryRun, webhooks[0].admissionReviewVersions: Required value: must specify one of v1, v1beta1]

declark1 avatar Dec 20 '21 20:12 declark1

Also waiting eagerly for an officially released version containing these changes. Do we have an ETA for that?

jgoeres avatar Jan 04 '22 12:01 jgoeres

Any update on when this might be addressed?

andrijaperovic avatar Jan 22 '22 01:01 andrijaperovic

This has been addressed already, see https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/1450. In short, you can use the image ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.3-3.1.1. For an overview of the images published to the new registry, check https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/pkgs/container/spark-operator. You'll simply need to set some helm overrides: --set image.tag=v1beta2-1.3.3-3.1.1 --set image.repository=ghcr.io/googlecloudplatform/spark-operator
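Putting those overrides together, the full install command might look like this (the release name, chart reference, and namespace are assumptions; adjust them to your setup):

```shell
helm upgrade --install spark-operator spark-operator/spark-operator \
  --namespace spark \
  --set image.repository=ghcr.io/googlecloudplatform/spark-operator \
  --set image.tag=v1beta2-1.3.3-3.1.1
```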

aneagoe avatar Jan 22 '22 06:01 aneagoe

@aneagoe: Do you know when this version will be available on Docker Hub? I have been asked to update the version of the Spark operator that will be supported on OpenShift 4.9.

sp-matrix avatar Aug 12 '22 19:08 sp-matrix