feat: add gRPC Spark submitter plugin + proto (optional plugin for spark-submit via gRPC)
Summary
This PR adds a configurable gRPC plugin for submitting Spark applications (instead of invoking ${SPARK_HOME}/bin/spark-submit).
Submitting via gRPC reduces per-submission overhead (no new process per submission) and enables improved controller throughput
when reconciling many new SparkApplications.
What this PR adds / changes
- `proto/sparksubmit/spark_submit.proto`
  - Proto service definition for a `SparkSubmitService` and messages for `SubmitRequest`/`SubmitResponse`.
- Generated protobuf stubs (under `proto/sparksubmit/`):
  - `spark_submit.pb.go`
  - `spark_submit_grpc.pb.go`
- `internal/controller/sparkapplication/grpc_submitter.go`
  - New `GRPCSubmitter` implementing the existing `SparkApplicationSubmitter` interface.
  - Uses the generated gRPC client to call a remote Spark submit service.
  - Fallback remains the existing `SparkSubmitter`, which runs `spark-submit` locally.
- `internal/controller/sparkapplication/grpc_submitter_test.go`
  - Unit tests for the gRPC submitter (server stubbed locally in tests).
  - Verifies expected request formation and error handling.
- `cmd/operator/controller/start.go`
  - Wire a new controller flag (`--scheduled-sa-timestamp-precision`) and pass the controller-wide option through to the scheduled Spark application reconciler options.
  - (Helm chart and values updated to provide default)
- `charts/spark-operator-chart/templates/controller/deployment.yaml`
  - Map the new CLI argument to the container args.
- `charts/spark-operator-chart/values.yaml`
  - Default values for the controller option `controller.scheduledSparkApplication.timestampPrecision`.
- `Makefile`
  - Proto generation targets and guidance; `proto-gen` target uses `protoc` and `protoc-gen-go`/`protoc-gen-go-grpc`.
Tests
- Unit tests updated/added. Local test runs show controller-level tests passing for the modified packages.
Behavior & Backwards compatibility
- Default behavior is unchanged: controller uses the existing `SparkSubmitter`, which runs `spark-submit`.
- The gRPC submitter is a new implementer of `SparkApplicationSubmitter`. It can be used by configuring the controller to use it (future config/flag or DI).
- Timestamp precision behavior for ScheduledSparkApplication is configurable via:
  - CLI flag: `--scheduled-sa-timestamp-precision`
  - Chart value: `controller.scheduledSparkApplication.timestampPrecision`
  - Default remains `nanos`.
How to generate proto stubs locally
From repo root:
# ensure bin dir for plugin binaries
export GOBIN=$(pwd)/bin
mkdir -p "$GOBIN"
export PATH="$GOBIN:$PATH"
# install protoc plugins (one-time)
go install google.golang.org/protobuf/cmd/protoc-gen-go@<version>
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@<version>
# run protoc (proto source is in proto/sparksubmit/)
protoc -I proto \
--go_out=proto --go_opt=paths=source_relative \
--go-grpc_out=proto --go-grpc_opt=paths=source_relative \
proto/sparksubmit/spark_submit.proto
Files changed
Added:
- `proto/sparksubmit/spark_submit.proto`
- `proto/sparksubmit/spark_submit.pb.go` (generated)
- `proto/sparksubmit/spark_submit_grpc.pb.go` (generated)
- `internal/controller/sparkapplication/grpc_submitter.go`
- `internal/controller/sparkapplication/grpc_submitter_test.go`

Modified:
- `Makefile` (proto-gen bits)
- `cmd/operator/controller/start.go`
- `charts/spark-operator-chart/templates/controller/deployment.yaml`
- `charts/spark-operator-chart/values.yaml`
- `go.mod` (if needed for generated packages)
- `internal/controller/scheduledsparkapplication/controller_test.go` (tests)
Fixed Issue
#2746
Testing and status
- Unit tests for `internal/controller/sparkapplication` pass locally in my environment.
- `protoc` and Go-based codegen were used to generate the stubs; `go test` passes for the updated packages.
Checklist (for reviewers)
- [ ] API (proto) looks reasonable (field names, message types).
- [ ] gRPC client usage is robust (timeouts, retries, error handling).
- [ ] Unit tests are sufficient to cover main code paths.
- [ ] Helm chart / CLI wiring is correct and default values sensible.
- [ ] Backwards compatibility verified (default uses existing submitter).
Hi @ChenYi015, could you please review this whenever you get a chance?
Changes from #2742 have been included in this PR and in some other PRs you have created. Please do not mix up changes from different PRs.
We also need to discuss the implementation details and potential ramifications. Let's discuss in the issue before jumping into implementing it.
> Changes from #2742 have been included in this PR and in some other PRs you have created. Please do not mix up changes from different PRs.
Yeah, I’ve been waiting for that PR to get merged, but it’s taking quite a long time. I tried rebasing and ran all the necessary commands to fix it, but it’s still pulling in commits from both branches. What should I do? Please help me out!!!
> We also need to discuss the implementation details and potential ramifications. Let's discuss in the issue before jumping into implementing it.
Sure @nabuskey! I’ll keep that in mind and make sure to discuss the implementation details in the issue before proceeding. If you prefer, I can close this PR so we can first discuss the idea in the issue thread. Once we reach a conclusion, I’ll open a new PR based on that discussion. Does that sound good?