kubespawner
[Feature] Let singleuser server select a free random port to listen on
About
This PR adds the functionality to let the singleuser server/pod choose the port it's listening on itself.
For this, you need to have the jupyterhub-kubespawner package installed in both the JupyterHub server and the singleuser server image.
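As a rough illustration of the general technique (not necessarily the code in this PR), a process can ask the operating system for a currently unused port by binding to port 0. The helper name `pick_free_port` is made up for this sketch.

```python
import socket


def pick_free_port(host: str = "") -> int:
    """Return a TCP port that was free at the time of the call.

    Binding to port 0 asks the OS to assign any unused port.
    `pick_free_port` is a hypothetical helper name, not a kubespawner API.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # getsockname() returns (address, port) for IPv4 sockets.
        return sock.getsockname()[1]


port = pick_free_port()
print(f"singleuser server would listen on port {port}")
```

Note that the port is released again when the helper returns, so another process could in principle grab it before the server binds to it.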
How it works
Currently, it works by setting the kubespawner/port: auto annotation on the profile list. It would probably be better to expose this as a real kubespawner config option, or to assume it when port = 0 is set.
It works almost exactly like the similar change in batchspawner (see https://github.com/jupyterhub/batchspawner/pull/58 & https://github.com/jupyterhub/batchspawner/pull/130).
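A minimal sketch of what such a profile could look like, written as plain Python data so it runs on its own. Only the annotation key and value come from the description above. Placing the annotation under `kubespawner_override` / `extra_annotations`, the display names, and the helper `wants_auto_port` are assumptions made for illustration.

```python
PORT_ANNOTATION = "kubespawner/port"  # annotation key named in this PR

# Assumed layout: the annotation is attached to one profile via
# kubespawner_override. In jupyterhub_config.py this list would be
# assigned to c.KubeSpawner.profile_list.
profile_list = [
    {
        "display_name": "Spark client (host network)",
        "kubespawner_override": {
            "extra_annotations": {PORT_ANNOTATION: "auto"},
        },
    },
    {
        "display_name": "Default",
        "default": True,
    },
]


def wants_auto_port(profile: dict) -> bool:
    """Hypothetical check: does this profile request automatic port selection?"""
    override = profile.get("kubespawner_override", {})
    annotations = override.get("extra_annotations", {})
    return annotations.get(PORT_ANNOTATION) == "auto"


for profile in profile_list:
    print(profile["display_name"], wants_auto_port(profile))
```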
Example Use Case
Our use case is to be able to connect to a remote Hadoop cluster, executing PySpark in client mode (so the Hadoop nodes can talk back to a real IP and not to the non-routable Pod IP). In this case, the singleuser pods run in hostNetwork: true mode.
Without this feature, the default ports would collide if more than one singleuser server is scheduled onto the same node.
We've been using this in production for several months now, without any issues so far.
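The collision can be reproduced locally without Kubernetes. The sketch below uses two listening sockets on one host as a stand-in for two hostNetwork pods on the same node. The helper `try_bind` is hypothetical and only exists for this illustration.

```python
import socket


def try_bind(port: int) -> socket.socket:
    """Bind and listen on 127.0.0.1:port; port 0 lets the OS choose."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock


# First "server" takes some port; treat it as the fixed default port.
first = try_bind(0)
fixed_port = first.getsockname()[1]

# A second "server" on the same host asking for the same port fails.
try:
    second = try_bind(fixed_port)
    second.close()
    collided = False
except OSError:
    collided = True

# With OS-assigned ports, two servers can coexist on the same host.
auto_a = try_bind(0)
auto_b = try_bind(0)
port_a = auto_a.getsockname()[1]
port_b = auto_b.getsockname()[1]

for sock in (first, auto_a, auto_b):
    sock.close()

print(collided, port_a != port_b)
```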
Related Issues
This could fix #299
Thanks for submitting your first pull request! You are awesome! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please make sure you followed the pull request template, as this will help us review your contribution more quickly.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:
Before you spend too much time on this (I haven't read it yet), you might want to look at:
- Something that already exists in batchspawner, which selects the port on the remote side and communicates it to JH. Implemented in https://github.com/jupyterhub/batchspawner/pull/58
- Since this would be generally useful, we thought to add it to core JupyterHub. There are some notes in a PR, but it is not working yet (and I doubt I'll have time to do much more): https://github.com/jupyterhub/jupyterhub/pull/2727
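The general pattern behind both items (the remote side picks a port, then reports it back to the hub) can be sketched with the standard library alone. This is a toy model, not the batchspawner implementation or the API proposed in jupyterhub#2727: the `/report-port` path, the JSON payload shape, and all function names are assumptions, and a real hub endpoint would also authenticate the request.

```python
import json
import socket
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

reported_ports = {}  # stand-in for per-user state kept by the spawner


class PortReportHandler(BaseHTTPRequestHandler):
    """Hypothetical hub-side endpoint that records the reported port."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        reported_ports[payload["user"]] = int(payload["port"])
        self.send_response(204)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the example output quiet


def pick_free_port() -> int:
    """Ask the OS for an unused port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


def report_port(hub_url: str, user: str, port: int) -> int:
    """Singleuser side: POST the chosen port to the (toy) hub endpoint."""
    body = json.dumps({"user": user, "port": port}).encode()
    request = urllib.request.Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An empty ProxyHandler bypasses any proxy set in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return response.status


# Start the toy hub on an OS-assigned port in a background thread.
hub = HTTPServer(("127.0.0.1", 0), PortReportHandler)
threading.Thread(target=hub.serve_forever, daemon=True).start()
hub_url = f"http://127.0.0.1:{hub.server_address[1]}/report-port"

chosen = pick_free_port()
status = report_port(hub_url, "alice", chosen)

hub.shutdown()
hub.server_close()
print(status, reported_ports)
```

After the report arrives, the spawner side would use the recorded port instead of a statically configured one when telling the hub where the server lives.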
Perhaps you'd like to work on the jupyterhub improvement some, then hook into that?
Note: I know some of batchspawner, less involved in both kubespawner and jupyterhub. I hope someone else more knowledgeable will comment on which options they recommend you to look into.
By all means, take over the pull request jupyterhub/jupyterhub#2727 that I started! It was designed in consultation with the JH team.
Hi @rkdarst , thanks for your feedback :+1: Actually https://github.com/jupyterhub/batchspawner/pull/58 is what inspired me to do this like it is now. I love the idea of having a generic callback endpoint for spawners to submit arbitrary data :thinking: Is it foreseeable to be merged soon-ish?
> I love the idea of having a generic callback endpoint for spawners to submit arbitrary data :thinking: Is it foreseeable to be merged soon-ish?
I think yes, if someone works on it. @minrk already went over the concepts, which is the important thing.
If someone (you?) can work on it, I could see it getting done soon. As I think I said, I might not get back to it any time soon, but judging from this PR you might be a good person to push it forward... is there anything I could help with?
I'm confused why there are so many commits unrelated to this PR. Can you rebase on master so I can better understand what's really relevant to review, @iwilltry42?
Hi @consideRatio, sorry for the commit mess. I don't know what happened there when merging master. I have now rebased onto master, but couldn't get rid of that one commit in travis (although I didn't change anything there). However, right now that's only a newline :man_shrugging:
@rkdarst, I'm trying to understand https://github.com/jupyterhub/jupyterhub/pull/2727 now. However, I feel like I'm lacking a clear specification there and am not sure if I can just pick it up on the fly.