spark-operator
Support spark 3.4.0
Spark 3.4.0 has been released: https://spark.apache.org/releases/spark-release-3-4-0.html
Is there a timeline for support within the Spark operator?
The operator is not hard-wired to any specific Spark version, so you can just package up your application with a Spark 3.4.0 image as the base, and I'm fairly certain that will work. If anything specific is broken, I suggest you report that specifically.
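For example, a minimal application image could look something like this (a sketch only; the base tag, jar name, and path are placeholders, not taken from this repo):
# Hypothetical application image built on the upstream Spark 3.4.0 image.
FROM apache/spark:3.4.0
# Add your application jar (example name and path only).
COPY target/my-spark-app.jar /opt/spark/examples/jars/my-spark-app.jar
The SparkApplication's image and mainApplicationFile would then point at that image and the jar via a local:/// path.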
linked to #1559 and #1688
We all agree that a SparkApplication can run Spark 3.4.0; however, the spark-operator image itself is built on Spark 3.1.1 only.
Some of us have Kubernetes security rules such as: images with known critical CVEs are not allowed in the prod environment. This prevents us from using the vanilla spark-operator image.
Besides that, spark_uid=185 also breaks these rules, as our rules require the UID to be >= 10000.
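One possible workaround for the UID rule is to layer a different user on top of the operator image (a sketch only; the tag matches the locally built image mentioned below, the UID value is arbitrary, and whether the operator still runs correctly as that user would need to be verified):
# Hypothetical Dockerfile: run the operator image with a UID >= 10000.
FROM spark-operator:v1beta2-1.3.8-3.4.1
USER 10185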
I obtained a seemingly working version by checking out tag v1beta2-1.3.8-3.1.1 and building with:
docker build --build-arg SPARK_IMAGE=apache/spark:3.4.1 -t spark-operator:v1beta2-1.3.8-3.4.1 .
I specified my version when installing the Helm chart and managed to run the Spark Pi example by setting:
image: "apache/spark:3.4.1"
mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.12-3.4.1.jar"
sparkVersion: "3.4.1"
in the spark-pi.yaml file.
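For reference, the custom operator image can be passed to the chart at install time roughly like this (a sketch; <your-registry> is a placeholder and the release and namespace names are assumptions, not the exact command I used):
helm install spark-operator spark-operator/spark-operator \
  --namespace spark-operator --create-namespace \
  --set image.repository=<your-registry>/spark-operator \
  --set image.tag=v1beta2-1.3.8-3.4.1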
Following @nmusatti's advice, I tried to build a new image for the Spark operator, but for Spark 3.3.1 (which should be no different from 3.4.1).
docker build --build-arg SPARK_IMAGE=apache/spark:3.3.1 -t spark-operator:v1beta2-1.3.8-3.3.1 .
Then I installed spark-operator using Helm and a values.yaml file:
helm install spark spark-operator/spark-operator --namespace spark-operator --create-namespace -f values.yaml
values.yaml file:
image:
  # -- Image repository
  repository: zebre8844/spark-operator
  # -- Image pull policy
  pullPolicy: IfNotPresent
  # -- if set, override the image tag whose default is the chart appVersion.
  tag: "v1beta2-1.3.8-3.3.1"
Installation is OK. Here are the service accounts in the spark-apps namespace, followed by the spark-pi.yaml I tried:
apiVersion: v1
items:
- apiVersion: v1
  kind: ServiceAccount
  metadata:
    creationTimestamp: "2023-09-26T08:20:02Z"
    name: default
    namespace: spark-apps
    resourceVersion: "756092"
    uid: 45b1f0e0-4e0c-4367-8bc7-9eac2c2de863
- apiVersion: v1
  kind: ServiceAccount
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{},"name":"spark","namespace":"spark-apps"}}
    creationTimestamp: "2023-09-26T08:22:31Z"
    name: spark
    namespace: spark-apps
    resourceVersion: "756624"
    uid: fbe86af1-26bf-4097-8b8a-9234e4c95c0a
kind: List
metadata:
  resourceVersion: ""
[root@ocd9-master-1 ~]# cat /opt/ocd9/spark-operator/spark-pi.yaml
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
annotations:
helm.sh/hook: pre-install, pre-upgrade
helm.sh/hook-delete-policy: hook-failed, before-hook-creation
helm.sh/hook-weight: "-10"
labels:
app.kubernetes.io/instance: spark
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: spark-operator
app.kubernetes.io/version: v1beta2-1.3.8-3.3.1
helm.sh/chart: spark-operator-1.1.27
name: spark-pi
namespace: spark-apps
spec:
type: Scala
mode: cluster
image: "xxxxxxx/spark-operator:v1beta2-1.3.8-3.3.1"
imagePullPolicy: Always
mainClass: org.apache.spark.examples.SparkPi
mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.12-3.3.1.jar"
sparkVersion: "3.3.1"
restartPolicy:
type: Never
volumes:
- name: "spark-data"
hostPath:
path: "/tmp"
type: Directory
driver:
cores: 2
#coreLimit: "200m"
memory: "512m"
labels:
version: 3.3.1
serviceAccount: spark
volumeMounts:
- name: "spark-data"
mountPath: "/tmp"
executor:
cores: 1
instances: 1
memory: "512m"
labels:
version: 3.3.1
serviceAccount: spark
volumeMounts:
- name: "spark-data"
mountPath: "/tmp"
Pod spark-pi-driver is Running but there are these errors:
W0926 13:22:25.514632 10 reflector.go:424] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169: failed to list *v1beta2.ScheduledSparkApplication: scheduledsparkapplications.sparkoperator.k8s.io is forbidden: User "system:serviceaccount:spark-apps:spark" cannot list resource "scheduledsparkapplications" in API group "sparkoperator.k8s.io" at the cluster scope
E0926 13:22:25.514674 10 reflector.go:140] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169: Failed to watch *v1beta2.ScheduledSparkApplication: failed to list *v1beta2.ScheduledSparkApplication: scheduledsparkapplications.sparkoperator.k8s.io is forbidden: User "system:serviceaccount:spark-apps:spark" cannot list resource "scheduledsparkapplications" in API group "sparkoperator.k8s.io" at the cluster scope
W0926 13:22:31.030031 10 reflector.go:424] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169: failed to list *v1beta2.SparkApplication: sparkapplications.sparkoperator.k8s.io is forbidden: User "system:serviceaccount:spark-apps:spark" cannot list resource "sparkapplications" in API group "sparkoperator.k8s.io" at the cluster scope
E0926 13:22:31.030059 10 reflector.go:140] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169: Failed to watch *v1beta2.SparkApplication: failed to list *v1beta2.SparkApplication: sparkapplications.sparkoperator.k8s.io is forbidden: User "system:serviceaccount:spark-apps:spark" cannot list resource "sparkapplications" in API group "sparkoperator.k8s.io" at the cluster scope
W0926 13:22:52.106555 10 reflector.go:424] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:spark-apps:spark" cannot list resource "pods" in API group "" at the cluster scope
E0926 13:22:52.106601 10 reflector.go:140] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:spark-apps:spark" cannot list resource "pods" in API group "" at the cluster scope
I've checked the service account, roles, role bindings, etc. These are the same as when installing spark-operator 3.1.1, which works.
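One way to double-check the RBAC side from the CLI is kubectl auth can-i, impersonating the service account from the errors above (a sketch):
# Should print "yes" if the spark service account can list these resources cluster-wide.
kubectl auth can-i list sparkapplications.sparkoperator.k8s.io \
  --as=system:serviceaccount:spark-apps:spark --all-namespaces
kubectl auth can-i list pods \
  --as=system:serviceaccount:spark-apps:spark --all-namespaces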
Maybe I missed something when building the image?
Thanks
+1 following
I've fixed the problem. The image used in the deployment and in the SparkApplication object was the spark-operator image, not a Spark image. Changing that fixed the issue.
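In other words, the SparkApplication spec should reference a Spark runtime image rather than the operator image, along the lines of the earlier example:
image: "apache/spark:3.3.1"
mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.12-3.3.1.jar"
sparkVersion: "3.3.1"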
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.