Support running without kubernetes, just docker

Open yuvipanda opened this issue 4 years ago • 13 comments

Proposed change

There's nothing inherent about binderhub that requires kubernetes. What it needs really is:

  1. A way to spawn repo2docker
  2. A way to tell a JupyterHub to run users with a specific image

This is doable without kubernetes! If you look at where we use kubernetes, it is mostly only in build.py. We already support another build backend - FakeBuild.

I'd like to support an additional backend that lets you run binderhub on a single host, with docker + a jupyterhub with dockerspawner. This will make it much easier for people to set up 'small' binderhubs internally, and also exercise our abstraction for future users (who might want it on HPCs, for example).

Alternative options

Decide that we have a hard dependency on kubernetes, and other setups will not be supported. I don't think we should do this though.

Who would use this feature?

  1. People who want to run binderhub on a single node, since they don't expect a lot of traffic (or it is authenticated)
  2. People who want to code for different backends eventually. This would also involve finishing up @manics's work in https://github.com/jupyterhub/repo2docker/pull/848 in repo2docker, but that can happen independent of work here.

(Optional): Suggest a solution

  1. Turn Build in https://github.com/jupyterhub/binderhub/blob/00bd8304e99e9a4edbdfcca6fd98293fb1fb2e6a/binderhub/build.py#L18 into an abstract, well-defined interface. I think this involves defining __init__, submit and stream_logs
  2. Implement a DockerBuild that implements Build on top of the docker API
  3. Investigate features we need that are deployed as part of the helm chart, rather than in the python package. Refactor as needed (keeping in mind needs of simplicity for mybinder.org)
  4. Move our kubernetes imports to a file, and make it conditional. This helps us make kubernetes an optional dependency!
  5. Add tests to make sure things work with just docker!
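
A minimal sketch of what steps 1 and 2 could look like, assuming an abstract base class built around the three methods named above. The class names `BuildExecutor` and `EchoBuild`, the constructor arguments, and the log lines are hypothetical illustrations, not BinderHub's actual API; `EchoBuild` only stands in for a real Docker-backed implementation.

```python
from abc import ABC, abstractmethod
from typing import Iterator, List


class BuildExecutor(ABC):
    """Hypothetical backend-neutral build interface (all names are assumptions)."""

    def __init__(self, repo_url: str, ref: str, image_name: str) -> None:
        self.repo_url = repo_url
        self.ref = ref
        self.image_name = image_name

    @abstractmethod
    def submit(self) -> None:
        """Start the build (for example, launch repo2docker)."""

    @abstractmethod
    def stream_logs(self) -> Iterator[str]:
        """Yield build log lines."""


class EchoBuild(BuildExecutor):
    """Toy backend that records what it would do instead of building anything."""

    def __init__(self, repo_url: str, ref: str, image_name: str) -> None:
        super().__init__(repo_url, ref, image_name)
        self._logs: List[str] = []

    def submit(self) -> None:
        self._logs.append(f"building {self.repo_url}@{self.ref}")
        self._logs.append(f"tagged {self.image_name}")

    def stream_logs(self) -> Iterator[str]:
        yield from self._logs


build = EchoBuild("https://github.com/example/repo", "HEAD", "example-image:latest")
build.submit()
for line in build.stream_logs():
    print(line)
```

Code that only calls `submit` and `stream_logs` would not need to know which backend it is talking to, which is the point of making the interface explicit.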

yuvipanda avatar Jun 30 '21 04:06 yuvipanda

Collecting other issues that propose the same idea.

https://github.com/jupyterhub/binderhub/issues/1061

https://discourse.jupyter.org/t/creating-a-new-binder-at-home-tool/

https://discourse.jupyter.org/t/binderhub-for-hpc/

betatim avatar Jun 30 '21 06:06 betatim

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/building-a-the-littlest-binderhub/9824/1

meeseeksmachine avatar Jul 01 '21 13:07 meeseeksmachine

Reposting from gitter: I think this makes a lot of sense in principle, but it concerns me a bit to expand the scope when we aren't keeping up with maintenance already. At the same time, any enthusiasm to contribute is beneficial and can help increase our maintenance support capacity.

Counterpoint to myself: if we can run BinderHub with just docker, it will make local testing of most of the logic way easier, maybe improving maintenance of the whole thing.

minrk avatar Jul 02 '21 08:07 minrk

> Counterpoint to myself: if we can run BinderHub with just docker, it will make local testing of most of the logic way easier, maybe improving maintenance of the whole thing.

+1

We can also put the docker logic in a separate package, so we end up in a situation like we have with spawners. I use SimpleLocalProcessSpawner for testing, and that's so much nicer than requiring k8s for it.

yuvipanda avatar Jul 02 '21 08:07 yuvipanda

:+1: to putting the Docker logic in a separate package. I think just the act of defining interfaces (e.g. BinderBuilder and BinderLauncher?) will improve maintainability of the code through clear documentation. At the moment there's an assumption that you know a lot of Kubernetes concepts, which can make the code hard to understand.

manics avatar Jul 02 '21 17:07 manics

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/do-i-have-to-use-a-repository-with-binderhub-if-not-how-do-i-locally-run-them/9886/2

meeseeksmachine avatar Jul 07 '21 22:07 meeseeksmachine

WIP: https://github.com/jupyterhub/binderhub/pull/1353 but it was easier than expected to get a demo working :smiley_cat:

manics avatar Sep 05 '21 17:09 manics

@manics has made amazing progress on this, and it's possible :)

I think there are a few more things we need to clean up.

  • [ ] Make our Quota system generic too. Right now it directly talks to k8s API and isn't subclassable.
  • [ ] Move the Kubernetes build to a KubernetesBuild class, rather than current Build class
  • [ ] Put k8s specific settings (like the api class, affinities, etc, etc) as traitlets on the KubernetesBuild class. Admins should be able to configure things there directly.
  • [ ] Figure out what other knobs to expose as traitlets from our Build class. Right now it's not a Traitlet.
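
As a rough illustration of the first item, here is one way a subclassable quota check might be shaped, so that the counting logic is the only backend-specific part. `LaunchQuota`, `InMemoryQuota`, `QuotaExceeded`, and their methods are hypothetical names for this sketch and do not reflect the current BinderHub quota code.

```python
from abc import ABC, abstractmethod
from typing import Dict


class QuotaExceeded(Exception):
    """Raised when a launch would exceed the configured limit."""


class LaunchQuota(ABC):
    """Hypothetical generic quota check; subclasses supply the counting."""

    def __init__(self, limit: int) -> None:
        self.limit = limit

    @abstractmethod
    def count_running(self, image_name: str) -> int:
        """Return how many sessions currently run this image."""

    def check(self, image_name: str) -> int:
        """Return the running count, or raise if the limit is reached."""
        running = self.count_running(image_name)
        if running >= self.limit:
            raise QuotaExceeded(f"{image_name}: {running}/{self.limit} running")
        return running


class InMemoryQuota(LaunchQuota):
    """Toy backend; a k8s or Docker subclass would query its own API here."""

    def __init__(self, limit: int, running: Dict[str, int]) -> None:
        super().__init__(limit)
        self.running = dict(running)

    def count_running(self, image_name: str) -> int:
        return self.running.get(image_name, 0)


quota = InMemoryQuota(limit=2, running={"busy-image": 2, "quiet-image": 1})
print(quota.check("quiet-image"))
```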

yuvipanda avatar Sep 20 '21 10:09 yuvipanda

It'd be nice if we could get most tests passing with both K8S and Docker since that reduces the barrier to contributing and testing PRs, and add a new pytest.mark.k8s for tests that only run with K8s.
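
A sketch of how such a marker might be wired up in a `conftest.py`, following the opt-in pattern from the pytest documentation. The `--k8s` command-line flag and the skip message are assumptions made for this example, not existing BinderHub test options.

```python
import pytest


def pytest_addoption(parser):
    # Hypothetical opt-in flag: K8s-only tests run only when it is passed.
    parser.addoption(
        "--k8s",
        action="store_true",
        default=False,
        help="run tests that need a Kubernetes cluster",
    )


def pytest_configure(config):
    # Register the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "k8s: test requires a Kubernetes cluster")


def pytest_collection_modifyitems(config, items):
    if config.getoption("--k8s"):
        return  # cluster requested: leave K8s tests enabled
    skip_k8s = pytest.mark.skip(reason="needs --k8s and a Kubernetes cluster")
    for item in items:
        if "k8s" in item.keywords:
            item.add_marker(skip_k8s)
```

A K8s-only test would then be decorated with `@pytest.mark.k8s`, and everything else would run against either backend.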

manics avatar Sep 20 '21 16:09 manics

This is neat. Is it possible for the build to work with podman as well as Docker?

rcthomas avatar Sep 20 '21 19:09 rcthomas

@rcthomas It is :smile: See https://github.com/jupyterhub/binderhub/compare/master...manics:local-binder-local-hub-podman?expand=1 I'll probably move that to a gist, or perhaps a deployment examples repo, at some point.

manics avatar Sep 20 '21 19:09 manics

Oh maybe you mentioned that during the demo, sorry if I missed it!

rcthomas avatar Sep 20 '21 20:09 rcthomas

I mentioned in passing that I'd gotten it working with Podman but didn't give any details. Still proof-of-concept rather than something you'd want in production, but fun to play with locally!

manics avatar Sep 20 '21 20:09 manics