Add extra_services option

Open dolfinus opened this issue 3 years ago • 5 comments

KubeSpawner has a services_enabled option that creates a service for the notebook pod. However, it only creates a service of type ClusterIP and only opens TCP port 80.

But sometimes processes running in the notebook pod also need a separate service. For example:

  1. Apache Spark on k8s (client mode): the driver runs in the notebook container and opens several ports, which executor pods then use to communicate with the driver.
  2. Apache Spark with the driver in k8s and executors in YARN: the same ports are used, but the service must be of type NodePort instead of ClusterIP so that the external network (executors running in the YARN cluster) can reach the driver running in the pod.
  3. Services running in extra_containers of the same pod, which can listen on their own ports.

Here I've added a new option, KubeSpawner.extra_services, where an administrator can describe a list of services:

c.KubeSpawner.extra_services = [
  {
    "name": "jupyter-{username}-spark-driver--{servername}",
    "ports": [
      {
        "name": "driver",
        "port": 1000,
        "target_port": 1000,
      },
      {
        "name": "block-manager",
        "port": 1001,
        "target_port": 1001,
      },
      {
        "name": "ui",
        "port": 1002,
        "target_port": 1002,
      },
    ],
  },
]

This uses the same mechanism as services_enabled: the list of services is created while spawning a new notebook, and all the services are attached to the created pod using the owner_references feature.

In addition to the service name and ports, the user can set the service type, labels, annotations and extra spec options such as clusterIP.
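
As a rough illustration of the paragraph above, the sketch below shows how one extra_services entry could be expanded into a Kubernetes Service manifest. It is not the code from this PR: the helper name service_manifest_from_entry, the entry keys type, labels, annotations and extra_spec, and the use of plain str.format for the name template are all assumptions made for the example.

```python
def service_manifest_from_entry(entry, username, servername, selector):
    """Build a Service manifest (plain dict) from one extra_services entry.

    Hypothetical helper: the keys "type", "labels", "annotations" and
    "extra_spec" are assumed names, not confirmed by the PR.
    """
    # Simplified name templating; KubeSpawner's own expansion also escapes names.
    name = entry["name"].format(username=username, servername=servername)

    ports = [
        {
            "name": p["name"],
            "port": p["port"],
            # Fall back to the service port when no target port is given.
            "targetPort": p.get("target_port", p["port"]),
        }
        for p in entry["ports"]
    ]

    spec = {
        "type": entry.get("type", "ClusterIP"),
        "selector": dict(selector),
        "ports": ports,
    }
    # Extra spec options such as clusterIP are merged in last.
    spec.update(entry.get("extra_spec", {}))

    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            "labels": dict(entry.get("labels", {})),
            "annotations": dict(entry.get("annotations", {})),
        },
        "spec": spec,
    }


example_entry = {
    "name": "jupyter-{username}-spark-driver--{servername}",
    "ports": [
        {"name": "driver", "port": 1000, "target_port": 1000},
        {"name": "ui", "port": 1002},
    ],
    "extra_spec": {"clusterIP": "None"},
}
manifest = service_manifest_from_entry(
    example_entry, "alice", "lab", selector={"app": "jupyterhub"}
)
```

The selector has to match labels that are actually set on the notebook pod; which labels to use is left to the caller here.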

Theoretically, values can be set dynamically using pre_spawn_hook. For example, in the case of a Spark driver in k8s with executors in YARN, the pre-spawn hook can generate a random nodePort, pass it into extra_services to create a NodePort service, and then pass the same port into an environment variable using extra_container_config.

Note: the LoadBalancer service type is not tested in this PR, as it requires a very complicated spec and I don't see any use case for it. But theoretically, it is possible to use it.
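
The dynamic case can be sketched as a pre_spawn_hook. This is only an illustration under stated assumptions: extra_services is the option proposed in this PR (not part of released KubeSpawner), the node_port key and the SPARK_DRIVER_PORT variable name are hypothetical, and the sketch sets the variable through the spawner's environment dict rather than extra_container_config to keep it short. 30000-32767 is the default Kubernetes NodePort range.

```python
import random

# Default NodePort range of a Kubernetes cluster (inclusive on both ends).
NODE_PORT_RANGE = (30000, 32767)


def pick_node_port(rng=random):
    """Pick a random port inside the default NodePort range."""
    return rng.randint(*NODE_PORT_RANGE)


def pre_spawn_hook(spawner):
    """Generate a nodePort and hand it to both the service and the notebook."""
    node_port = pick_node_port()

    # Proposed option from this PR; "node_port" is an assumed key name.
    spawner.extra_services = [
        {
            "name": "jupyter-{username}-spark-driver--{servername}",
            "type": "NodePort",
            "ports": [
                {
                    "name": "driver",
                    "port": node_port,
                    "target_port": node_port,
                    "node_port": node_port,
                },
            ],
        },
    ]

    # Hypothetical variable name; the driver would read its port from it.
    spawner.environment = {
        **spawner.environment,
        "SPARK_DRIVER_PORT": str(node_port),
    }


# In jupyterhub_config.py this would be registered as:
# c.KubeSpawner.pre_spawn_hook = pre_spawn_hook
```

A randomly chosen port can collide with a nodePort that is already allocated, in which case the API server rejects the Service, so a real hook would need to retry or track the ports it has handed out.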

dolfinus avatar Sep 29 '22 17:09 dolfinus

Thanks for submitting your first pull request! You are awesome! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please make sure you followed the pull request template, as this will help us review your contribution more quickly. You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:

welcome[bot] avatar Sep 29 '22 17:09 welcome[bot]

@GeorgianaElena Could you please review?

dolfinus avatar Oct 10 '22 08:10 dolfinus

I worry that adding another configuration like this goes beyond what we can maintain sustainably in the KubeSpawner project.

Are there reasonable workarounds that avoid embedding logic like this in KubeSpawner, perhaps?

Ping @GeorgianaElena @manics @minrk for further opinions.

consideRatio avatar Oct 10 '22 09:10 consideRatio

@consideRatio I thought I'd commented but obviously hadn't.... I'm thinking exactly the same as you!

This might be a good time to consider how to bring in the extensibility mentioned in https://discourse.jupyter.org/t/jupyterhub-amalthea/12208 i.e. the ability to patch the singleuser server configuration into KubeSpawner

manics avatar Oct 10 '22 09:10 manics

The modify_pod_hook hook cannot be used here because it is called before the pod is created, so the owner_reference cannot be fetched. There could be a new hook that runs after the pod is created, but I'm not sure whether that is okay.

dolfinus avatar Oct 10 '22 09:10 dolfinus

Ok, implemented after_pod_created_hook in #644

dolfinus avatar Oct 12 '22 15:10 dolfinus

KubeSpawner has a services_enabled option that creates a service for the notebook pod. However, it only creates a service of type ClusterIP and only opens TCP port 80.

How can I add another port when I use services_enabled in kubernetes?

ghost avatar Feb 27 '23 12:02 ghost

How can I add another port when I use services_enabled in kubernetes?

It turns out this is not possible

c.KubeSpawner.extra_services = [
  {
    "name": "jupyter-{username}-spark-driver--{servername}",
    "ports": [
      {
        "name": "driver",
        "port": 1000,
        "target_port": 1000,
      },
      {
        "name": "block-manager",
        "port": 1001,
        "target_port": 1001,
      },
      {
        "name": "ui",
        "port": 1002,
        "target_port": 1002,
      },
    ],
  },
]

How to use the after_pod_created_hook to do the same thing?

ghost avatar Feb 28 '23 09:02 ghost

How to use the after_pod_created_hook to do the same thing?

make_service doesn't support multiple ports. How did you translate your extra_services example to after_pod_created_hook @dolfinus ? Did you call the make_service function multiple times to create a new service for every port?

https://github.com/jupyterhub/kubespawner/pull/644

ghost avatar Feb 28 '23 10:02 ghost

Did you call the make_service function multiple times to create a new service for every port?

I've created a new service for all the ports I need using the Kubernetes API.

dolfinus avatar Feb 28 '23 15:02 dolfinus

Thank you very much for your quick reply!

I've created new service for all the ports I need using Kubernetes API.

How did you call the Kubernetes API? This must go into kubespawner config somehow. Did you call make_service inside after_pod_created_hook multiple times to achieve that?

ghost avatar Feb 28 '23 15:02 ghost

I'm using a slightly modified version of kubespawner.spawner.get_service_manifest, which calls kubespawner.objects.make_service with a different port number and service name, and then I call the k8s API with a copy of this code: https://github.com/jupyterhub/kubespawner/blob/cd4c08d5e175e3b6d58e279c27265e6e95e6197b/kubespawner/spawner.py#L2741-L2757

dolfinus avatar Mar 01 '23 14:03 dolfinus

Sorry, I don't get it yet. For 2 ports per pod do you create 2 Kubernetes services per pod (and call make_service twice)?

From what I understand, you probably use a customized version of kubespawner, and this cannot be achieved with after_pod_created_hook. Why did you close this extra_services PR?

ghost avatar Mar 01 '23 17:03 ghost

Yes, I create one more service for the same pod. Why not? These services are created with an owner_reference, so removing the pod automatically removes all the services. This is done by k8s itself, so I don't need to handle this event on the KubeSpawner side.
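
A minimal sketch of the one-service-per-port approach described above, assuming the hook receives the created pod as a dict with metadata.name, metadata.namespace, metadata.uid and metadata.labels. It only builds the manifests; per_port_services and the port mapping are hypothetical names, and submitting each manifest to the cluster (for example with the Kubernetes client's create_namespaced_service) is left out.

```python
def owner_reference(pod):
    """Owner reference pointing at the notebook pod.

    With this set on a Service, Kubernetes garbage collection deletes the
    Service when the pod is deleted.
    """
    meta = pod["metadata"]
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "name": meta["name"],
        "uid": meta["uid"],
        "blockOwnerDeletion": True,
        "controller": False,
    }


def per_port_services(pod, ports):
    """Build one Service manifest per entry of ports (port name -> number)."""
    meta = pod["metadata"]
    owner = owner_reference(pod)
    manifests = []
    for port_name, port in ports.items():
        manifests.append(
            {
                "apiVersion": "v1",
                "kind": "Service",
                "metadata": {
                    # Each service needs its own name within the namespace.
                    "name": f"{meta['name']}-{port_name}",
                    "namespace": meta["namespace"],
                    "ownerReferences": [owner],
                },
                "spec": {
                    "type": "ClusterIP",
                    # Assumption: selecting on all pod labels targets this pod.
                    "selector": dict(meta.get("labels", {})),
                    "ports": [
                        {"name": port_name, "port": port, "targetPort": port},
                    ],
                },
            }
        )
    return manifests


example_pod = {
    "metadata": {
        "name": "jupyter-alice",
        "namespace": "jhub",
        "uid": "0000-1111",
        "labels": {"app": "jupyterhub", "component": "singleuser-server"},
    }
}
services = per_port_services(example_pod, {"driver": 1000, "block-manager": 1001})
```

Inside an after_pod_created_hook, the pod argument would be passed to per_port_services and each resulting manifest created in the pod's namespace. Putting several entries into one manifest's ports list would give a single Service with one name for all ports.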

dolfinus avatar Mar 01 '23 19:03 dolfinus

Thank you!

Yes, I create one more service to the same pod. Why not?

It worked! Unfortunately, I need all ports to have the same service name. Do I need to patch kubespawner for that?

ghost avatar Mar 02 '23 14:03 ghost

Probably

dolfinus avatar Mar 02 '23 19:03 dolfinus