
Node affinity doesn't work

Open yangqi766 opened this issue 2 years ago • 5 comments

This feature is very important. For example, when a cluster has both CPU machines and GPU machines, being able to schedule pods onto specific nodes is a must!

[Screenshot: 企业微信截图_20220427134954 (WeCom screenshot)]

yangqi766 avatar Apr 27 '22 05:04 yangqi766

In my experience, this can happen when the mutating admission webhook is missing (not enabled).

https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/docs/quick-start-guide.md#about-the-mutating-admission-webhook

oscar-dela avatar May 15 '22 07:05 oscar-dela

I use nodeSelector and it works for me, in case you want to give it a try.
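For readers who want to try this, here is a minimal sketch of what a node selector on a SparkApplication might look like. The comment above does not say which field was used, so the placement under `driver` and `executor`, the application name, and the `node-type` label are all assumptions for illustration. Required fields such as the image and main application file are omitted.

```yaml
# Fragment of a SparkApplication manifest (not a complete spec).
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: example-app          # hypothetical name
spec:
  driver:
    nodeSelector:
      node-type: cpu         # hypothetical node label: keep the driver on CPU nodes
  executor:
    nodeSelector:
      node-type: gpu         # hypothetical node label: run executors on GPU nodes
```

The label key and values must match labels that actually exist on your nodes (check with `kubectl get nodes --show-labels`).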

josecsotomorales avatar Jun 28 '22 02:06 josecsotomorales

I had the same issue and was able to get it working using a pod template file without the need to enable the webhook https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/1176#issuecomment-1179287656

elihschiff avatar Jul 08 '22 19:07 elihschiff

> I had the same issue and was able to get it working using a pod template file without the need to enable the webhook #1176 (comment)

Can you give an example of the pod template?

ghost avatar Jul 19 '22 03:07 ghost

My pod template has a lot of custom settings for my infrastructure. However, any valid pod YAML file should work. The Spark and EMR on EKS docs have more details: https://spark.apache.org/docs/latest/running-on-kubernetes.html#pod-template https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/pod-templates.html
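As a rough illustration only (this is not the template referred to above), a pod template that expresses node affinity could look like the sketch below. The file name and the `node-type` label are hypothetical. Spark reads such a file through the `spark.kubernetes.driver.podTemplateFile` and `spark.kubernetes.executor.podTemplateFile` properties described in the Spark docs linked above.

```yaml
# pod-template.yaml (hypothetical file name)
apiVersion: v1
kind: Pod
spec:
  affinity:
    nodeAffinity:
      # Hard requirement: only schedule onto nodes carrying the label below.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node-type     # hypothetical node label
                operator: In
                values:
                  - gpu
```

The file has to be readable from wherever the Spark submission runs, so how you mount or ship it depends on your setup.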

elihschiff avatar Jul 19 '22 03:07 elihschiff

--conf spark.kubernetes.node.selector.kubernetes.io/hostname=sz-exa-cpu-10 works for me.

Ref: https://spark.apache.org/docs/latest/running-on-kubernetes.html
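With the operator, the same property could presumably be set through the `sparkConf` map of the SparkApplication instead of a `--conf` flag. The fragment below is a sketch under that assumption and reuses the hostname from the comment above. According to the Spark docs, `spark.kubernetes.node.selector.[labelKey]` applies to both driver and executor pods.

```yaml
# Fragment of a SparkApplication manifest (other required fields omitted).
spec:
  sparkConf:
    # Equivalent of: --conf spark.kubernetes.node.selector.kubernetes.io/hostname=sz-exa-cpu-10
    "spark.kubernetes.node.selector.kubernetes.io/hostname": "sz-exa-cpu-10"
```

Pinning to a single hostname is the narrowest possible selector. A shared node label is usually more practical when several nodes qualify.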

archongum avatar Jul 06 '23 07:07 archongum