Consider moving the BinderHub helm chart + deployment docs into the JupyterHub helm chart

Open choldgraf opened this issue 7 years ago • 11 comments

For several reasons it would be useful long-term if there weren't two different helm charts (one for JupyterHub and one for BinderHub) as well as two totally separate sets of documentation, given that they overlap so much.

One idea that's been kicked around a bit is to treat a "BinderHub" as a particular flavor of a JupyterHub on Kubernetes, rather than as a wholly separate thing. Doing this would basically require two main things:

  1. Make "BinderHub functionality" possible via configuration in the JupyterHub helm chart
  2. Move much of the "BinderHub install guide" docs into a section of the Z2JH guide

I think this could have a lot of benefits / improve the long-term sustainability of the codebases/documentation. It'd be great to chat about this further! Just opening this issue for conversation.

cc @yuvipanda as I think he has thoughts on this

choldgraf avatar Feb 02 '18 23:02 choldgraf

Back in the day, the binderhub chart (currently this subdirectory of the binderhub repo) used to live here, as a separate chart. It was moved over primarily because:

  1. BinderHub was moving far faster than z2jh at that time
  2. It was a separate, non-integrated chart that needed upkeep by itself.
  3. BinderHub was not a JupyterHub service, but a program by itself that did not talk to the JupyterHub API

This, however, has come with negatives:

  1. Confusion around 'binderhub' vs 'jupyterhub', when in reality BinderHub is just a JupyterHub service, similar to how 'cull_idle_users' is a jupyterhub service
  2. We need to spend twice the effort in documenting JupyterHub here and BinderHub helm chart in the other repo, while in many cases they're pretty much the same
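
To make the "just a JupyterHub service" comparison concrete: JupyterHub services are declared as a list of dictionaries in jupyterhub_config.py, using keys such as `name`, `command`, and `url`. The sketch below is only an illustration of that shape, not a description of how BinderHub is wired up today. The script name, timeout value, and binder URL are hypothetical placeholders.

```python
import sys


def example_services():
    """Return service definitions in the shape JupyterHub's `services` option accepts.

    All concrete values here (script name, timeout, URL) are hypothetical.
    """
    return [
        {
            # Hub-managed service: JupyterHub starts this command itself.
            "name": "cull-idle",
            "command": [sys.executable, "cull_idle_servers.py", "--timeout=3600"],
        },
        {
            # Externally managed service: the Hub only needs to know its URL.
            "name": "binder",
            "url": "http://binder.example.org",
        },
    ]


# In a real jupyterhub_config.py this would be assigned as:
#   c.JupyterHub.services = example_services()
```

Seen this way, a culler and a binder service differ only in who starts the process, which is the point of the comparison above.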

I propose that we do the following:

  1. Move the binderhub helm chart here, but
  2. Integrate the components into the jupyterhub helm chart we have here.

Ideally, the following snippet in your config.yaml would be all that is needed to enable a binder service (dynamic image building) for your jupyterhub:

binderhub:
   enabled: true
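
As a rough sketch of what such a toggle implies: Helm merges the user's config.yaml over the chart's default values, and templates can then be gated on a flag like `binderhub.enabled`. The Python below only models that merge-then-gate logic. `DEFAULTS`, `merge_values`, and `enabled_components` are hypothetical names for illustration, and no chart currently ships this option.

```python
import copy

# Hypothetical chart defaults: the binder service is off unless requested.
DEFAULTS = {"binderhub": {"enabled": False}}


def merge_values(base, override):
    """Recursively merge `override` into a copy of `base`, like Helm does for values."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge_values(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def enabled_components(user_config):
    """Return the components that would be deployed for a given config.yaml."""
    values = merge_values(DEFAULTS, user_config)
    components = ["hub", "proxy"]  # always part of a JupyterHub deployment
    if values["binderhub"]["enabled"]:
        components.append("binderhub")
    return components


print(enabled_components({}))
print(enabled_components({"binderhub": {"enabled": True}}))
```

With the flag defaulting to off, existing JupyterHub deployments would be unaffected unless they opt in.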

We can do this in ways that don't unduly increase the maintenance burden for us on deploying mybinder.org changes.

With this, we'll have:

  1. github.com/jupyterhub/binderhub -> contains binderhub code, similar to how github.com/jupyterhub/jupyterhub contains jupyterhub code
  2. This repo, which will contain a JupyterHub chart that has an option for enabling BinderHub

This also forces us (and makes it easier!) to have a release cadence for public usage of the binderhub chart similar to what we have for the (more mature) z2jh JupyterHub chart.

yuvipanda avatar Feb 03 '18 04:02 yuvipanda

@yuvipanda do you have a feeling for whether this is a short/medium/long-term goal? In terms of how much effort it'd take to make this happen?

@mpacer this is the issue I mentioned earlier today, in case you have thoughts

choldgraf avatar Feb 06 '18 01:02 choldgraf

I think of this as something we could do for z2jh 0.7 or 0.8, so near-mid term.

yuvipanda avatar Feb 06 '18 23:02 yuvipanda

I am actually now convinced this is something we should do for z2jh 0.7, and so might start doing this soon (unless there are objections!). I'll plan on laying out a path forward here first in the next few days, and then go from there.

yuvipanda avatar Feb 07 '18 01:02 yuvipanda

Let's discuss this at the next team meeting. I'm supportive of simplifying and better maintainability. I do want to make sure that we're all on the same page and that we don't accidentally confuse ourselves, our users, or others in the wider Jupyter project.

Much like an incident report, it would be great to have a simple template for larger changes such as this one where we can define the change, its impact, and our communication strategy to the larger Jupyter team and the community. It doesn't need to be a separate document. There's an example below that is my understanding of the above discussion. Feel free to edit it directly here. @choldgraf @yuvipanda


Proposal: Change location of BinderHub helm chart and documentation

Challenge

Two sets of documentation and helm charts (binderhub and jupyterhub) duplicate content.

Goals

  • Minimize duplicated content in code and documentation
  • Improve maintainability

Proposed changes

  • Treat binderhub as a service of jupyterhub
  • github.com/jupyterhub/binderhub repo contains binderhub code (similar to how github.com/jupyterhub/jupyterhub repo contains jupyterhub code)
  • The binderhub repo will contain a sample JupyterHub chart that has an option for enabling BinderHub (Question for clarification: Does this JupyterHub chart differ from the released JupyterHub helm chart, and if so, how?)
    • the following snippet in your config.yaml would be all that is needed to enable a binder service (dynamic image building) for your jupyterhub:
    binderhub:
      enabled: true
    
  • Move binderhub documentation (binderhub repo) into Zero to JupyterHub documentation (z2jh repo)
    • add a new section: Activating the BinderHub Service
    • deprecate the existing BinderHub documentation with a redirection to the new section in z2jh documentation

Benefits

  • easier to have a release cadence for a public binderhub chart (similar to the release cadence of the more mature z2jh chart)
  • simplify maintenance through better encapsulation by treating binderhub as a service of jupyterhub
  • Simplified "one stop" documentation of Kubernetes-based deployments of JupyterHub and BinderHub

Proposed timeline and actions

  • Timing: ZeroToJupyterHub Helm Chart 0.7 (next release)
  • Release of BinderHub 0.1 Helm Chart
  • Announce documentation changes and repo changes to Jupyter mailing list and binder-dev mailing list

willingc avatar Mar 07 '18 20:03 willingc

I'll close this as we've had several informal discussions on BinderHub vs Z2JH and other decoupling.

manics avatar Sep 20 '21 18:09 manics

@manics do we have a temperature of the room in terms of how people feel about this? I know it's been discussed a bunch of times but I'm not sure whether people are like "net positive" or "net negative" or "I'm not sure without more concrete proposals/data"?

I agree with @willingc's suggestion above that this issue probably warrants a more fleshed out proposal / discussion / decision before committing to doing this in the project. Maybe a next step is to create a proposal with more "meat on the bone" for discussion, or prototypes to play with, and then having a team discussion around it in a new issue.

choldgraf avatar Sep 21 '21 17:09 choldgraf

@choldgraf I don't have a feeling. We've had several other related discussions though, so I thought I'd close this as I thought it was out of date. I'm happy for it to be re-opened though.

manics avatar Sep 21 '21 17:09 manics

@manics @choldgraf Maybe transfer the issue to the team compass repo for further discussion in a meeting.

willingc avatar Sep 21 '21 22:09 willingc

Another idea: can we move to a model where you set up a JupyterHub and then deploy (one or more) extra services that use that JH? For BinderHub this would mean that the instructions would be "Use z2jh to deploy a JH, then deploy the BinderHub service (via its own chart)".

What are other JH services and how do they get deployed/added to a JH? Do we want to have all possible JH services controlled from the Z2JH chart? Can we model different services like plugins (not all controlled from a central repo)?

Right now the z2jh chart is a dependency of the BH chart. So if you already have a JupyterHub you can't add the BH service to it. Is this a problem that people want fixed?
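
One consequence of that dependency, sketched under the general Helm rule that values for a dependency are nested under the dependency's name: JupyterHub settings in a BinderHub deployment sit under a `jupyterhub` key and belong to the BinderHub release, not to a standalone hub. The helper name `as_binderhub_values` and the sample settings are hypothetical.

```python
def as_binderhub_values(z2jh_values, binderhub_values=None):
    """Nest standalone z2jh values under `jupyterhub`, where a parent chart expects them."""
    values = dict(binderhub_values or {})
    values["jupyterhub"] = z2jh_values
    return values


# Settings that would be top-level in a standalone z2jh config.yaml (sample values).
standalone = {"hub": {"config": {}}, "singleuser": {"memory": {"limit": "1G"}}}

print(as_binderhub_values(standalone))
```

The same settings end up in two differently shaped config files depending on which chart is installed, which is part of why an already-deployed hub cannot simply be reused.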

The writing I've read (from people outside the JH team) that dreams of a "Hub unification" seems to be based on a misunderstanding of what BH is. This makes it tricky to figure out what people actually want when they wish for a hub unification.

I understand what people desire from a "I am a *Hub developer" point of view, but I am wondering what features/benefits the "*Hub users" are looking for. Getting a better understanding of those desires would be useful in order to decide what to do next. Without that we risk making a (big) move that still results in people being unhappy.

One thing I have found myself wanting, as a *Hub user, is being able to deploy one JH that is used like a normal JH by some users and then reuse that JH to also power a BH (that is deployed on the same cluster). Mostly because it is more work to deploy and maintain one JH for the "JH as normal" use case and then a second one for the "JH for BH" use case. However an upside has been that the configuration and operation of these two hubs is totally separate (I can upgrade them separately, changing settings for one doesn't touch the other).

TL;DR: I like the idea of thinking of adding services to a JH as "installing a plugin/extension" and I don't have a good idea of what drives the desire of hub users for "hub unification".

betatim avatar Sep 23 '21 11:09 betatim

fwiw, I love @willingc's proposal in https://github.com/jupyterhub/team-compass/issues/452#issuecomment-371267846. At that time I had burnt out and toned down my involvement for a while, but I still strongly believe this is the path forward for better adoption of binderhub, which ultimately is something necessary for less stressful maintenance too.

yuvipanda avatar Oct 03 '23 23:10 yuvipanda