
Support spark 3.4.0

Open hajapy opened this issue 1 year ago • 7 comments

Spark 3.4.0 has been released: https://spark.apache.org/releases/spark-release-3-4-0.html

Is there a timeline for support within the Spark operator?

hajapy avatar May 04 '23 11:05 hajapy

The operator is not hard-wired to any specific Spark version, so you can just package up your application with a Spark 3.4.0 image as the base, and I'm certain that will work. If anything specific is broken, then I suggest you report that specifically.
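
For example, a minimal sketch of such an application image (the jar name and target path are placeholders, not something from this thread):

FROM apache/spark:3.4.0
# Add your application jar on top of the stock Spark 3.4.0 image
COPY target/my-spark-app.jar /opt/spark/jars/my-spark-app.jar

The SparkApplication would then reference this image and point mainApplicationFile at local:///opt/spark/jars/my-spark-app.jar.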

jalkjaer avatar May 05 '23 16:05 jalkjaer

linked to #1559 and #1688

We all agree that a SparkApplication can run Spark 3.4.0; however, the spark-operator image itself is built on Spark 3.1.1 only.

Some of us have security rules in Kubernetes like: images with known critical CVEs are not allowed in the prod environment. This prevents us from using the vanilla spark-operator image.

Besides that, spark_uid=185 also breaks these rules, as our rules require the UID to be >= 10000.
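
As a rough sketch of a workaround, assuming the image is rebuilt from a Dockerfile that exposes the spark_uid build argument (the upstream Apache Spark Dockerfiles declare ARG spark_uid=185; the image tag below is a placeholder), the UID could be overridden at build time:

docker build --build-arg spark_uid=10185 -t my-registry/spark:3.4.0-uid10185 .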

julienlau avatar May 22 '23 09:05 julienlau

I obtained a seemingly working version by checking out tag v1beta2-1.3.8-3.1.1 and building with:

docker build --build-arg SPARK_IMAGE=apache/spark:3.4.1 -t spark-operator:v1beta2-1.3.8-3.4.1 .

I specified my version when installing the Helm chart and managed to run the Spark Pi example by setting:

image: "apache/spark:3.4.1"
mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.12-3.4.1.jar"
sparkVersion: "3.4.1"

in the spark-pi.yaml file.
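
For reference, a minimal sketch of the corresponding Helm install with the custom image (the release, namespace, and repository names are placeholders; image.repository and image.tag are the chart's standard image values):

helm install spark-operator spark-operator/spark-operator \
  --namespace spark-operator --create-namespace \
  --set image.repository=my-registry/spark-operator \
  --set image.tag=v1beta2-1.3.8-3.4.1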

nmusatti avatar Jun 27 '23 16:06 nmusatti

Following @nmusatti's advice, I tried to build a new image for the spark operator, but for Spark version 3.3.1 (which should be no different than 3.4.1):

docker build --build-arg SPARK_IMAGE=apache/spark:3.3.1 -t spark-operator:v1beta2-1.3.8-3.3.1 .

Then I installed spark-operator using helm and values.yaml file:

helm install spark spark-operator/spark-operator --namespace spark-operator --create-namespace -f values.yaml

values.yaml file:

image:
  # -- Image repository
  repository: zebre8844/spark-operator
  # -- Image pull policy
  pullPolicy: IfNotPresent
  # -- if set, override the image tag whose default is the chart appVersion.
  tag: "v1beta2-1.3.8-3.3.1"

Installation is OK, but I get errors when I try the spark-pi.yaml. Here are the service accounts in the spark-apps namespace, followed by the spark-pi.yaml file:

apiVersion: v1
items:
- apiVersion: v1
  kind: ServiceAccount
  metadata:
    creationTimestamp: "2023-09-26T08:20:02Z"
    name: default
    namespace: spark-apps
    resourceVersion: "756092"
    uid: 45b1f0e0-4e0c-4367-8bc7-9eac2c2de863
- apiVersion: v1
  kind: ServiceAccount
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{},"name":"spark","namespace":"spark-apps"}}
    creationTimestamp: "2023-09-26T08:22:31Z"
    name: spark
    namespace: spark-apps
    resourceVersion: "756624"
    uid: fbe86af1-26bf-4097-8b8a-9234e4c95c0a
kind: List
metadata:
  resourceVersion: ""
[root@ocd9-master-1 ~]# cat /opt/ocd9/spark-operator/spark-pi.yaml
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  annotations:
    helm.sh/hook: pre-install, pre-upgrade
    helm.sh/hook-delete-policy: hook-failed, before-hook-creation
    helm.sh/hook-weight: "-10"
  labels:
    app.kubernetes.io/instance: spark
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: spark-operator
    app.kubernetes.io/version: v1beta2-1.3.8-3.3.1
    helm.sh/chart: spark-operator-1.1.27
  name: spark-pi
  namespace: spark-apps
spec:
  type: Scala
  mode: cluster
  image: "xxxxxxx/spark-operator:v1beta2-1.3.8-3.3.1"
  imagePullPolicy: Always
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.12-3.3.1.jar"
  sparkVersion: "3.3.1"
  restartPolicy:
    type: Never
  volumes:
    - name: "spark-data"
      hostPath:
        path: "/tmp"
        type: Directory
  driver:
    cores: 2
    #coreLimit: "200m"
    memory: "512m"
    labels:
      version: 3.3.1
    serviceAccount: spark
    volumeMounts:
      - name: "spark-data"
        mountPath: "/tmp"
  executor:
    cores: 1
    instances: 1
    memory: "512m"
    labels:
      version: 3.3.1
    serviceAccount: spark    
    volumeMounts:
      - name: "spark-data"
        mountPath: "/tmp"

Pod spark-pi-driver is Running but there are these errors:

W0926 13:22:25.514632 10 reflector.go:424] pkg/mod/k8s.io/client-go@.../tools/cache/reflector.go:169: failed to list *v1beta2.ScheduledSparkApplication: scheduledsparkapplications.sparkoperator.k8s.io is forbidden: User "system:serviceaccount:spark-apps:spark" cannot list resource "scheduledsparkapplications" in API group "sparkoperator.k8s.io" at the cluster scope
E0926 13:22:25.514674 10 reflector.go:140] pkg/mod/k8s.io/client-go@.../tools/cache/reflector.go:169: Failed to watch *v1beta2.ScheduledSparkApplication: failed to list *v1beta2.ScheduledSparkApplication: scheduledsparkapplications.sparkoperator.k8s.io is forbidden: User "system:serviceaccount:spark-apps:spark" cannot list resource "scheduledsparkapplications" in API group "sparkoperator.k8s.io" at the cluster scope
W0926 13:22:31.030031 10 reflector.go:424] pkg/mod/k8s.io/client-go@.../tools/cache/reflector.go:169: failed to list *v1beta2.SparkApplication: sparkapplications.sparkoperator.k8s.io is forbidden: User "system:serviceaccount:spark-apps:spark" cannot list resource "sparkapplications" in API group "sparkoperator.k8s.io" at the cluster scope
E0926 13:22:31.030059 10 reflector.go:140] pkg/mod/k8s.io/client-go@.../tools/cache/reflector.go:169: Failed to watch *v1beta2.SparkApplication: failed to list *v1beta2.SparkApplication: sparkapplications.sparkoperator.k8s.io is forbidden: User "system:serviceaccount:spark-apps:spark" cannot list resource "sparkapplications" in API group "sparkoperator.k8s.io" at the cluster scope
W0926 13:22:52.106555 10 reflector.go:424] pkg/mod/k8s.io/client-go@.../tools/cache/reflector.go:169: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:spark-apps:spark" cannot list resource "pods" in API group "" at the cluster scope
E0926 13:22:52.106601 10 reflector.go:140] pkg/mod/k8s.io/client-go@.../tools/cache/reflector.go:169: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:spark-apps:spark" cannot list resource "pods" in API group "" at the cluster scope

I've checked the service account, roles, role bindings, etc. They are the same as when installing spark-operator 3.1.1, which works.

Maybe I missed something when building the image?

Thanks

md-software avatar Sep 26 '23 13:09 md-software

+1 following

andreyolv avatar Oct 23 '23 17:10 andreyolv

I've fixed the problem. The image used in the deployment and in the SparkApplication object was the spark-operator image, not a Spark image. Changing that fixed the issue.
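
In other words, in a spark-pi.yaml like the one above, the image field should point to a plain Spark image rather than the operator image, along the lines of (the 3.3.1 tag mirrors the build earlier in this thread):

  image: "apache/spark:3.3.1"
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.12-3.3.1.jar"
  sparkVersion: "3.3.1"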

md-software avatar Oct 24 '23 13:10 md-software

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Aug 14 '24 06:08 github-actions[bot]