kubespawner
Add extra_services option
KubeSpawner has a services_enabled option which creates a service for the notebook pod. But it only creates a service of type ClusterIP, and only opens TCP port 80.
But sometimes other processes running in the notebook pod also need a separate service. For example:
- Apache Spark in k8s (client mode) - the driver runs in the notebook container and opens several ports, which executor pods then use to communicate with the driver
- Apache Spark with the driver in k8s and executors in YARN - the same ports are used, but NodePort is required instead of ClusterIP, allowing access from an external network (executors running in the YARN cluster) to the driver running in the pod
- Services running in extra_containers of the same pod, which can listen on some port
Here I've added a new option, KubeSpawner.extra_services, where an administrator can describe a list of services:
c.KubeSpawner.extra_services = [
    {
        "name": "jupyter-{username}-spark-driver--{servername}",
        "ports": [
            {
                "name": "driver",
                "port": 1000,
                "target_port": 1000,
            },
            {
                "name": "block-manager",
                "port": 1001,
                "target_port": 1001,
            },
            {
                "name": "ui",
                "port": 1002,
                "target_port": 1002,
            },
        ],
    },
]
The same mechanism as in services_enabled is used: this list of services is created while spawning a new notebook, and all the services are attached to the created pod using the owner_references feature.
In addition to the service name and ports, the user can set the service type, labels, annotations and extra spec options, like clusterIP.
Theoretically, values can be set dynamically using pre_spawn_hook. For example, in the case of a Spark driver in k8s with executors in YARN, a pre-spawn hook can generate a random nodePort, pass it into extra_services to create a NodePort service, and then pass the same port into an environment variable using extra_container_config.
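A rough sketch of the pre_spawn_hook idea described above, not code from this PR. The "type" and "node_port" keys are assumed names for the proposed extra_services option, SPARK_DRIVER_PORT is a hypothetical variable name, and the port is passed through spawner.environment instead of extra_container_config to keep the example short. The 30000-32767 range is the Kubernetes default for NodePort and may differ per cluster; a random pick can also collide with a port that is already allocated.

```python
import random

# Default Kubernetes NodePort range; a cluster may be configured differently.
NODE_PORT_RANGE = (30000, 32767)


def pick_node_port(rng=random):
    """Return a random port inside the assumed NodePort range."""
    low, high = NODE_PORT_RANGE
    return rng.randint(low, high)


def spark_pre_spawn_hook(spawner):
    """Hypothetical hook: choose a NodePort and expose it to the notebook."""
    node_port = pick_node_port()
    # extra_services is the option proposed in this PR; the "type" and
    # "node_port" keys are assumptions made for this illustration.
    spawner.extra_services = [
        {
            "name": "jupyter-{username}-spark-driver--{servername}",
            "type": "NodePort",
            "ports": [
                {
                    "name": "driver",
                    "port": node_port,
                    "target_port": node_port,
                    "node_port": node_port,
                },
            ],
        },
    ]
    # Hypothetical variable name; the notebook would read it to configure Spark.
    spawner.environment["SPARK_DRIVER_PORT"] = str(node_port)


# In jupyterhub_config.py this would be registered roughly as:
# c.KubeSpawner.pre_spawn_hook = spark_pre_spawn_hook
```

Using the same number for port, target_port and node_port keeps the port the driver advertises identical to the one external executors connect to.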
Note: the LoadBalancer service type is not tested in this PR, as it requires a very complicated spec and I don't see any use case for it. But theoretically, it is possible to use it.
@GeorgianaElena Could you please review?
I worry that adding another configuration like this goes beyond what we can maintain sustainably in the KubeSpawner project.
Are there reasonable workarounds that avoid embedding logic like this in KubeSpawner?
Ping @GeorgianaElena @manics @minrk for further opinions.
@consideRatio I thought I'd commented but obviously hadn't.... I'm thinking exactly the same as you!
This might be a good time to consider how to bring in the extensibility mentioned in https://discourse.jupyter.org/t/jupyterhub-amalthea/12208 i.e. the ability to patch the singleuser server configuration into KubeSpawner
The modify_pod_hook hook cannot be used here because it is called before the pod is created, so the owner_reference cannot be fetched. There could be a new hook that runs after the pod is created, but I'm not sure whether that is okay.
Ok, implemented after_pod_created_hook in #644
KubeSpawner has a services_enabled option which creates a service for the notebook pod. But it only creates a service of type ClusterIP, and only opens TCP port 80.
How can I add another port when I use services_enabled in kubernetes?
It turns out this is not possible
c.KubeSpawner.extra_services = [
    {
        "name": "jupyter-{username}-spark-driver--{servername}",
        "ports": [
            {
                "name": "driver",
                "port": 1000,
                "target_port": 1000,
            },
            {
                "name": "block-manager",
                "port": 1001,
                "target_port": 1001,
            },
            {
                "name": "ui",
                "port": 1002,
                "target_port": 1002,
            },
        ],
    },
]
How to use the after_pod_created_hook to do the same thing?
make_service doesn't support multiple ports.
How did you translate your extra_services example to after_pod_created_hook, @dolfinus? Did you call the make_service function multiple times to create a new service for every port?
https://github.com/jupyterhub/kubespawner/pull/644
Did you call the make_service function multiple times to create a new service for every port?
I've created a new service for each of the ports I need, using the Kubernetes API.
Thank you very much for your quick reply!
I've created a new service for each of the ports I need, using the Kubernetes API.
How did you call the Kubernetes API? This must go into kubespawner config somehow. Did you call make_service inside after_pod_created_hook multiple times to achieve that?
I'm using a slightly modified version of kubespawner.spawner.get_service_manifest, which calls kubespawner.objects.make_service with a different port number and service name, and then I call the k8s API with a copy of this code:
https://github.com/jupyterhub/kubespawner/blob/cd4c08d5e175e3b6d58e279c27265e6e95e6197b/kubespawner/spawner.py#L2741-L2757
Sorry, I don't get it yet. For 2 ports per pod do you create 2 Kubernetes services per pod (and call make_service twice)?
From what I understand, you probably use a customized version of kubespawner, and this cannot be achieved with after_pod_created_hook. Why did you close this extra_services PR?
Yes, I create one more service for the same pod. Why not? These services are created with an owner_reference, so removing the pod automatically removes all the services. This is done by k8s itself, so I don't need to handle this event on the KubeSpawner side.
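For readers following along, here is a minimal sketch of that approach, not the code actually used above. It assumes after_pod_created_hook is called with the spawner and the created pod as a plain dict, that spawner.api is an async CoreV1Api client, and that the pod's labels are specific enough to be used as the service selector. EXTRA_PORTS and the service naming scheme are hypothetical.

```python
import asyncio

# Hypothetical extra ports, taken from the extra_services example in this PR.
EXTRA_PORTS = {"driver": 1000, "block-manager": 1001, "ui": 1002}


def owner_reference(pod):
    """Point at the pod so Kubernetes garbage-collects the service with it."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "name": pod["metadata"]["name"],
        "uid": pod["metadata"]["uid"],
        "blockOwnerDeletion": True,
        "controller": False,
    }


def service_manifest(pod, port_name, port):
    """Build a single-port ClusterIP Service manifest owned by the pod."""
    pod_name = pod["metadata"]["name"]
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            # Assumed naming scheme; must stay a valid DNS label.
            "name": f"{pod_name}-{port_name}",
            "ownerReferences": [owner_reference(pod)],
        },
        "spec": {
            "type": "ClusterIP",
            # Assumption: the pod's labels select only this pod.
            "selector": dict(pod["metadata"]["labels"]),
            "ports": [{"name": port_name, "port": port, "targetPort": port}],
        },
    }


async def create_extra_services(spawner, pod):
    """Hook body: create one service per extra port."""
    for port_name, port in EXTRA_PORTS.items():
        await spawner.api.create_namespaced_service(
            namespace=spawner.namespace,
            body=service_manifest(pod, port_name, port),
        )


# In jupyterhub_config.py this would be registered roughly as:
# c.KubeSpawner.after_pod_created_hook = create_extra_services
```

Because each service carries an ownerReferences entry pointing at the pod, no cleanup code is needed when the server stops.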
Thank you!
Yes, I create one more service for the same pod. Why not?
It worked! Unfortunately, I need all ports to be under the same service name. Do I need to patch kubespawner for that?
Probably
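One possible alternative to patching, which nobody in this thread has confirmed: since a Kubernetes Service can list several ports, the manifest could be built by hand instead of through make_service. The sketch below is an illustration under that assumption. The function names are hypothetical, the pod is assumed to be a plain dict with name, uid and labels, and the port dicts follow the snake_case shape of the extra_services example.

```python
def to_service_ports(ports):
    """Convert snake_case port dicts into Kubernetes ServicePort entries."""
    return [
        {
            "name": p["name"],
            "port": p["port"],
            "targetPort": p["target_port"],
            "protocol": p.get("protocol", "TCP"),
        }
        for p in ports
    ]


def multi_port_service(name, pod, ports, service_type="ClusterIP"):
    """Build one Service manifest that exposes every port under one name."""
    meta = pod["metadata"]
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            # Owned by the pod, so the service is removed together with it.
            "ownerReferences": [
                {
                    "apiVersion": "v1",
                    "kind": "Pod",
                    "name": meta["name"],
                    "uid": meta["uid"],
                }
            ],
        },
        "spec": {
            "type": service_type,
            # Assumption: the pod's labels select only this pod.
            "selector": dict(meta["labels"]),
            "ports": to_service_ports(ports),
        },
    }
```

Such a manifest could then be submitted from after_pod_created_hook with the Kubernetes API client. Note that when a Service has more than one port, Kubernetes requires every port to have a name.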