[WIP] Add Support for notebooks/spark operator to manifests
✏️ Summary of Changes
This is a work in progress to automate the installation of the Spark operator integration with notebooks. We currently see an issue when starting the kernel: it needs to communicate back to Enterprise Gateway, but it fails silently. I believe this is related to Istio and would appreciate some help.
Connection Flow
1. The kernel starts and encrypts its connection details.
2. The kernel sends those details back to Enterprise Gateway.
3. Enterprise Gateway decrypts and reads the info.
4. The Gateway passes the connection info to the kernel's proxy.
5. The Gateway uses that info to connect to the kernel's ports (shell, iopub, stdin, heartbeat, control).
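For reference, a minimal sketch of the payload involved, assuming the standard five-port Jupyter kernel connection file. The helper names are hypothetical, and base64 merely stands in for Enterprise Gateway's real encryption step; the point is that the Gateway must be able to reach all five ZMQ ports on the kernel pod, which is exactly the traffic an Istio sidecar would need to permit:

```python
import base64
import json

def kernel_connection_info(ip, key):
    # The five ZMQ ports the gateway must reach on the kernel pod,
    # plus the HMAC key used to sign ZMQ messages. Port numbers are
    # illustrative placeholders.
    return {
        "ip": ip,
        "key": key,
        "transport": "tcp",
        "shell_port": 52317,
        "iopub_port": 52318,
        "stdin_port": 52319,
        "hb_port": 52320,
        "control_port": 52321,
    }

def encode_payload(info):
    # Stand-in for the kernel-side encryption of the connection file.
    return base64.b64encode(json.dumps(info).encode())

def decode_payload(blob):
    # Stand-in for the gateway-side decryption.
    return json.loads(base64.b64decode(blob))

info = kernel_connection_info("10.1.2.3", "secret")
assert decode_payload(encode_payload(info)) == info
```

If the round trip succeeds but the Gateway still cannot connect, the connection info itself is likely fine and the problem sits in the network path between the pods.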
✅ Contributor Checklist
- [ ] I have tested these changes with kustomize. See Installation Prerequisites.
- [X] All commits are signed-off to satisfy the DCO check.
- [ ] I have considered adding my company to the adopters page to support Kubeflow and help the community, since I expect help from the community for my issue (see 1. and 2.).
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: (no approvers yet). Once this PR has been reviewed and has the lgtm label, please assign juliusvonkohout for approval. For more information see the Kubernetes Code Review Process.
Why do you not use the modern standard spark-connect with interactive session support instead of the enterprise gateway?
I think spark connect doesn't cover the full Spark API or multi user isolation needs compared to JEG.
We isolate per namespace, so why is spark-connect not multi-tenant? CC @vikas-saxena02
If Spark Connect is installed per namespace, then yes. But I am not 100% sure whether different notebook kernels can then have different Spark drivers.
It would be nice to investigate this further. Maybe @fresende has already looked into it.
The goal is to have Spark separated per namespace, so the Spark cluster and Spark Connect are deployed per namespace and only the JupyterLabs in that namespace can access them.
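Under that model, a notebook would reach only its namespace-local Spark Connect endpoint. A small sketch of how the in-cluster endpoint could be derived; the service name `spark-connect` and the default Spark Connect port 15002 are assumptions, not settings taken from this PR:

```python
def spark_connect_url(namespace, service="spark-connect", port=15002):
    """Build the in-cluster DNS URL for a namespace-local Spark Connect
    service. A notebook kernel would pass this to
    SparkSession.builder.remote(...) to attach to its namespace's server."""
    return f"sc://{service}.{namespace}.svc.cluster.local:{port}"

# Each namespace gets its own endpoint; notebooks in team-a never see team-b's.
assert spark_connect_url("team-a") == "sc://spark-connect.team-a.svc.cluster.local:15002"
```

Cross-namespace access would then be blocked at the network layer (for example via NetworkPolicies or Istio AuthorizationPolicies), not by the URL scheme itself.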
I have experimented with deploying Spark Connect separately as a service, and I could do it at the namespace level. But I haven't tried it with the new SparkConnect CRD which was added recently.
Thanks and regards, Vikas Saxena.
Do you mean that each notebook in that namespace will have its own Spark Connect and Spark application?
The classic Jupyter + IPython kernel approach with PySpark would allow each user to run their own Spark driver instance, giving them full control over configuration, context, and resource usage without interference from others. It’s a mature, well-proven model that supports the entire PySpark API surface, including advanced features and low-level tuning not yet available through Spark Connect. This setup ensures predictable behavior, easier debugging, and maximum compatibility with existing Spark workflows and Jupyter integrations.
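To make the classic model concrete, here is a sketch of the per-user driver configuration a notebook kernel would pass to `SparkSession.builder.config(...)`. The helper is hypothetical and the image tag and executor count are placeholders, but the `spark.kubernetes.*` property names are standard Spark-on-Kubernetes settings:

```python
def per_user_driver_conf(user, namespace, image="spark:3.5.1"):
    """Spark configuration for a driver owned by a single notebook user.
    Each kernel gets its own driver, so tuning one user's session cannot
    interfere with another's."""
    return {
        # One driver (and app) per user, easy to identify in the cluster
        "spark.app.name": f"notebook-{user}",
        # Run against the in-cluster Kubernetes API server
        "spark.master": "k8s://https://kubernetes.default.svc",
        # Executor pods stay inside the user's namespace
        "spark.kubernetes.namespace": namespace,
        "spark.kubernetes.container.image": image,
        # Resource usage is scoped to this driver only
        "spark.executor.instances": "2",
    }

conf = per_user_driver_conf("alice", "team-a")
assert conf["spark.kubernetes.namespace"] == "team-a"
```

Because every kernel owns its driver, the full PySpark API (including context-level tuning) is available, at the cost of one driver's worth of resources per active notebook.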
Spark Connect, on the other hand (which I believe came to replace Apache Livy), has its merits and is very good for providing shared Spark as a service.
We can definitely continue investigating the Spark Connect integration further, but I believe it should happen in parallel, as the two will probably serve different use cases.
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I think I need to dive into https://www.kubeflow.org/docs/components/spark-operator/user-guide/notebooks-spark-operator/ first.