
Use intstr.IntOrString for cores in SparkApplication

Open rmannibucau opened this issue 1 month ago • 7 comments

What feature you would like to be added?

I'd like to be able to give the driver container less than 1 core (e.g. 100m).

Why is this needed?

Currently cores is an int32.

Describe the solution you would like

intstr.IntOrString or Value sound like options for getting the same int-or-string support as any k8s container.
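
For illustration, a sketch of what the field could look like after such a change; this is hypothetical, not the current spark-operator type:

package v1beta2

import "k8s.io/apimachinery/pkg/util/intstr"

// Excerpt of a hypothetical SparkPodSpec: cores could then be given either as
// an integer (1) or as a string quantity ("100m"), like container resources.
type SparkPodSpec struct {
    Cores *intstr.IntOrString `json:"cores,omitempty"`
}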

Describe alternatives you have considered

Mutation webhooks, but they are not frictionless.

Additional context

No response

Love this feature?

Give it a 👍. We prioritize the features with the most 👍.

rmannibucau avatar Oct 28 '25 11:10 rmannibucau

The value of sparkapplication.spec.[driver|executor].cores is passed to the spark-submit script as the --driver-cores or --executor-cores parameter. Currently, these two parameters are required to be integers.

$ ${SPARK_HOME}/bin/spark-submit --help
...
 Cluster deploy mode only:
  --driver-cores NUM          Number of cores used by the driver, only in cluster mode
                              (Default: 1).
...
 Spark standalone, YARN and Kubernetes only:
  --executor-cores NUM        Number of cores used by each executor. (Default: 1 in
                              YARN and K8S modes, or all available cores on the worker
                              in standalone mode).

ChenYi015 avatar Oct 28 '25 11:10 ChenYi015

@ChenYi015 I guess it would be fine to pass 0.125 or 100m in the spec and have --driver-cores 1 (the ceiled value) passed to spark-submit.
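
For illustration, a minimal Go sketch (the ceilCores helper is made up) of how a fractional quantity could be rounded up to the whole-core value handed to spark-submit:

package main

import (
    "fmt"

    "k8s.io/apimachinery/pkg/api/resource"
)

// ceilCores parses a CPU quantity such as "100m" or "0.125" and rounds it up
// to the whole-core value that --driver-cores would receive.
func ceilCores(s string) (int64, error) {
    q, err := resource.ParseQuantity(s)
    if err != nil {
        return 0, err
    }
    // MilliValue: "100m" -> 100, "0.125" -> 125, "2" -> 2000.
    return (q.MilliValue() + 999) / 1000, nil
}

func main() {
    for _, s := range []string{"100m", "0.125", "2"} {
        n, err := ceilCores(s)
        if err != nil {
            panic(err)
        }
        fmt.Printf("%s -> --driver-cores %d\n", s, n)
    }
}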

rmannibucau avatar Oct 28 '25 12:10 rmannibucau

> @ChenYi015 I guess it would be fine to pass 0.125 or 100m in the spec and have --driver-cores 1 (the ceiled value) passed to spark-submit.

Yes. It is possible to achieve this with the webhook. When a driver/executor pod is created, the webhook will mutate the cores to what the sparkapp spec specifies.
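
A minimal sketch of what that mutation could look like (the helper name is made up, and the driver container lookup is simplified to index 0):

package webhook

import (
    corev1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/api/resource"
)

// mutateDriverCPU overwrites the driver container's CPU request with the
// fractional value taken from the SparkApplication spec.
func mutateDriverCPU(pod *corev1.Pod, cpu string) {
    c := &pod.Spec.Containers[0]
    if c.Resources.Requests == nil {
        c.Resources.Requests = corev1.ResourceList{}
    }
    c.Resources.Requests[corev1.ResourceCPU] = resource.MustParse(cpu)
}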

ChenYi015 avatar Oct 29 '25 06:10 ChenYi015

Hi @rmannibucau @ChenYi015, any update on this issue? If you are not working on this, can I work on it?

rahul810050 avatar Dec 08 '25 07:12 rahul810050

If you are running Spark on Kubernetes, you can set coreRequest: '100m' instead of updating cores, and cores will remain unchanged. In this case, the pod will request 100m of CPU.
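
For reference, a sketch of the two relevant SparkPodSpec fields as I understand the v1beta2 API (comments are illustrative, not from the source):

package v1beta2

// Excerpt of SparkPodSpec:
type SparkPodSpec struct {
    // Cores only takes whole CPUs and feeds --driver-cores / --executor-cores.
    Cores *int32 `json:"cores,omitempty"`
    // CoreRequest takes a Kubernetes quantity such as "100m" and controls the
    // pod's CPU request without touching Cores.
    CoreRequest *string `json:"coreRequest,omitempty"`
}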

houyuting avatar Dec 09 '25 05:12 houyuting

Hi @rmannibucau! Thanks for the confirmation. I will work on this issue and make a PR. Feel free to review it whenever you get a chance.

rahul810050 avatar Dec 09 '25 06:12 rahul810050

> Hi @rmannibucau! Thanks for the confirmation. I will work on this issue and make a PR. Feel free to review it whenever you get a chance.

Hi @rahul810050, if you add intstr support for driver cores, will executor cores support it as well? If yes: spark.executor.cores is used to calculate the maximum number of tasks that can be assigned to a single executor, in conjunction with spark.task.cpus. If spark.executor.cores is no longer an integer after your PR, e.g. 0.15, how many tasks will be assigned to a single executor?

houyuting avatar Dec 09 '25 06:12 houyuting