
Can't have webhook working in multiple namespaces

Open doctapp opened this issue 5 years ago • 13 comments

We're installing the spark op to multiple namespaces. Our trick for now is to set installCrds=true only for the first install. The problem is that a second install breaks the first install's webhook.

Any idea on how to support this, i.e., deploy the spark op to multiple k8s ns?

Thanks

doctapp avatar Apr 15 '20 20:04 doctapp

What do you mean by "breaks the first install's webhook"?

liyinan926 avatar Apr 16 '20 14:04 liyinan926

Sorry, secrets and configmaps stopped being mounted in the first install.

doctapp avatar Apr 16 '20 15:04 doctapp

Got it. Do you have the flag webhook-namespace-selector set? The selector field in the MutatingWebhookConfiguration is set based on the value of webhook-namespace-selector. Since MutatingWebhookConfiguration is global (non-namespaced), the webhook running in one namespace overrides the MutatingWebhookConfiguration created by the webhook in another namespace, and effectively changes the namespace selector.
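For example, you can inspect the shared object directly and watch whichever install wrote last win (names below are illustrative; list the real ones with kubectl get mutatingwebhookconfigurations):

kubectl get mutatingwebhookconfiguration spark-webhook-config -o yaml
# the relevant part of the output is the selector derived from
# -webhook-namespace-selector, which a second install overwrites:
#
#   webhooks:
#   - name: webhook.sparkoperator.k8s.io
#     namespaceSelector:
#       matchLabels:
#         <label from -webhook-namespace-selector>: "true"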

liyinan926 avatar Apr 17 '20 16:04 liyinan926

Searching around the code and docs, I don't see it documented. It seems that having multiple Spark operators installed in a development cluster is something others want to be able to do as well. It would be great to have a recipe for doing that, and/or a pointer to the pattern as used in Kubernetes itself.

jkleckner avatar Apr 21 '20 03:04 jkleckner

Completely agree with @jkleckner. Currently, we have a hack of only installing the CRDs on the first install and removing them on the last install. We're still testing out the webhook part (thanks @liyinan926!).

doctapp avatar Apr 21 '20 13:04 doctapp

A good way to make the webhook work for multiple namespaces is to add some custom label to each of the namespaces, e.g., spark-operator-webhook-applicable=true, and then set the namespace selector flag -webhook-namespace-selector=spark-operator-webhook-applicable=true when starting the operator.
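For concreteness, a minimal sketch of that recipe (namespace and label names are just examples):

kubectl label ns ns1 spark-operator-webhook-applicable=true
kubectl label ns ns2 spark-operator-webhook-applicable=true

# and in the operator deployment's container args:
#   -webhook-namespace-selector=spark-operator-webhook-applicable=true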

liyinan926 avatar Apr 21 '20 19:04 liyinan926

Hi, we tried many alternatives and they all fail. Once we install the second spark op, secrets/configmaps stop being mounted for the first one. Here's what we ran:

kubectl label --overwrite ns ns1 ns1-webhook-applicable=true

helm install spark-operator ./charts/sparkoperator \
              --namespace ns1 \
              --set installCrds=true \
              --set operatorVersion=v1beta2-1.1.1-2.4.5 \
              --set sparkJobNamespace=ns1 \
              --set imagePullPolicy=Always \
              --set enableMetrics=true \
              --set enableWebhook=true \
              --set webhookPort=443 \
              --set webhookServiceNamespace=ns1 \
              --set webhookNamespaceSelector=ns1-webhook-applicable=true

kubectl label --overwrite ns ns2 ns2-webhook-applicable=true

helm install spark-operator ./charts/sparkoperator \
              --namespace ns2 \
              --set installCrds=false \
              --set operatorVersion=v1beta2-1.1.1-2.4.5 \
              --set sparkJobNamespace=ns2 \
              --set imagePullPolicy=Always \
              --set enableMetrics=true \
              --set enableWebhook=true \
              --set webhookPort=443 \
              --set webhookServiceNamespace=ns2 \
              --set webhookNamespaceSelector=ns2-webhook-applicable=true

We also tried with and without webhookServiceNamespace, which also failed.

Any idea what's wrong?

Thanks

doctapp avatar Apr 27 '20 21:04 doctapp

I have a problem which I think is related to this: #892

Turns out the chart in helm's incubator repo does not support this setting yet. After I added the flag myself, it seems to work properly.

andizzle avatar Apr 28 '20 06:04 andizzle

Thanks for the pointers @andizzle. We were finally able to install multiple Spark operators into different namespaces. Here's how we did it (super hacky):

  • Add a label to each ns, e.g., ns1-webhook-applicable=true
  • Hack the Spark Op helm source to add webhookNamespaceSelector (e.g., -webhook-namespace-selector={{ .Values.webhookNamespaceSelector }})
  • Create a unique MutatingWebhookConfiguration to be used in each namespace (e.g., -webhook-config-name={{ include "sparkoperator.fullname" . }}-webhook-config-{{ .Release.Namespace }})
  • Create unique spark-operator cluster role and cluster role bindings to be used in each namespace
  • Only use --set installCrds=true for the first install
  • Uninstall using --no-hooks for all but the last install, and clean up the spark-webhook-certs secret in the namespace where the Spark Operator is being uninstalled. For the last install, uninstall with hooks enabled and manually delete the CRDs afterwards (see the sketch below).
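A rough sketch of the uninstall side of that life cycle (release, namespace, and secret names are from our setup; yours may differ):

# all but the last install: skip hooks, then remove the leftover certs secret
helm uninstall spark-operator --namespace ns1 --no-hooks
kubectl delete secret spark-webhook-certs --namespace ns1

# last install: run hooks normally, then delete the CRDs by hand
helm uninstall spark-operator --namespace ns2
kubectl delete crd sparkapplications.sparkoperator.k8s.io scheduledsparkapplications.sparkoperator.k8s.io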

Thanks

doctapp avatar Apr 30 '20 17:04 doctapp

Hey @doctapp great to hear that! Can you create a PR for the instructions above?

liyinan926 avatar Apr 30 '20 17:04 liyinan926

Glad it worked out for you. I have a few questions around your approach:

  • I made a similar change in the helm chart but with a different namespace label convention: sparkoperator.webhook.selector={{ .Values.sparkJobNamespace }}. This saves one variable in the values file; I can't think of a case where the webhook should listen to a different namespace than {{ .Values.sparkJobNamespace }}?
  • The webhook config name in the chart has {{ include "sparkoperator.fullname" . }} as a prefix, so installing with helm would already make it different in each install? I'm not sure it's necessary to add a {{ .Release.Namespace }} suffix to differentiate it as well.
  • I'm not sure what your use case for two installs of SparkOperator is, but is --set installCrds=true for both SparkOperators really an issue? Unless one of them gets deleted at some point.

andizzle avatar Apr 30 '20 19:04 andizzle

@liyinan926 I would do a PR on a consolidated repo between this one and https://github.com/helm/charts/tree/master/incubator/sparkoperator, as I have no idea which one is supposed to be the master. Thanks :)

@andizzle:

  • Makes sense, I can't see why you would listen to a different namespace
  • Yes, that's what we need for running multiple spark ops. The config name is cluster-wide; therefore you need multiple configs to deploy the Spark Op to multiple namespaces (see the listing below)
  • The use case for multiple installs is running Spark jobs in isolation in different namespaces. If you set installCrds=true beyond the first install, you'll have problems when deleting the operators. Our use case requires a different life cycle for each spark op instance, e.g., create ns1, create ns2, delete ns1, create ns1, ...
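With the unique config names in place, listing them shows one per install, so neither overwrites the other (names below are illustrative):

kubectl get mutatingwebhookconfigurations
# NAME
# spark-operator-webhook-config-ns1
# spark-operator-webhook-config-ns2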

doctapp avatar Apr 30 '20 20:04 doctapp

Hi @doctapp, we are trying to accomplish the same on an OpenShift cluster but are facing the same issue. Do you have any document or detailed steps to follow to overcome it? I am not sure of the downside of this resolution not being included in the chart yet. I tried the steps you gave above, but it looks like something is missing or the helm chart has been updated since then.

sbbagal13 avatar May 12 '22 22:05 sbbagal13