binderhub
binderhub copied to clipboard
`buildDockerConfig` is broken, `binder-build-docker-config` has multiple uses
This is a follow-up from https://github.com/jupyterhub/binderhub/pull/1592
c.BinderHub.build_docker_config isn't used by the BinderHub Python app, only it's presence is checked:
$ git grep build_docker_config origin/main
origin/main:binderhub/app.py: build_docker_config = Dict(
origin/main:binderhub/app.py: "build_docker_config": self.build_docker_config,
origin/main:binderhub/builder.py: if self.settings["use_registry"] or self.settings["build_docker_config"]:
Instead it's used in the Helm Chart secret:
https://github.com/jupyterhub/binderhub/blob/2ba938bf20201ff864b3a6c473f66f30fbef92bc/helm-chart/binderhub/templates/secret.yaml#L34-L42
but the secret refers to config.BinderHub.buildDockerConfig instead of config.BinderHub.build_docker_config.
Furthermore from @consideRatio https://github.com/jupyterhub/binderhub/blob/2ba938bf20201ff864b3a6c473f66f30fbef92bc/helm-chart/binderhub/templates/_helpers.tpl#L42-L45
This segment seems like a bug, or it is at least a notable issue because its simply separate from providing config.BinderHub.build_docker_config in values. Whats under .config should be pure passthrough - so buildDockerConfig won't map to build_docker_config.
This was originally added in https://github.com/jupyterhub/binderhub/pull/1255
Aside from it being broken, this is now a problem in https://github.com/jupyterhub/binderhub/pull/1521 where the Kubernetes config is being decoupled from the main BinderHub application, and there's a two-way dependency between BinderHub and the Helm Chart:
- BinderHub owns the config
- Helm Chart creates the static secret from that config
- BinderHub registry uses that secret to check the registry https://github.com/jupyterhub/binderhub/blob/8351995380a48ab1da096e02b2a1b93aab4bbc77/helm-chart/binderhub/templates/deployment.yaml#L68-L76 Though there's also a docker-socket mount here as an option
- BinderHub creates a build pod that requires that secret
A few potential options
Move buildDockerConfig to the top level of Values (.Values.buildDockerConfig), the Helm Chart has full ownership
This is the simplest option, and it also means BinderHub can use the same secret to query the registry. However if we want to have a variable Docker build config on a per-user or per-repo basis (see also https://github.com/jupyterhub/binderhub/pull/1169, https://github.com/jupyterhub/binderhub/pull/1577) this is not a good option.
Move buildDockerConfig to KubernetesExecutor.build_docker_config, and configure it with config.KubernetesExecutor.build_docker_config, The Helm chart creates a secret from this config
This is close to the current implementation, except that the property is owned by KubernetesExecutor instead of BinderHub. The secret is still created by the Helm Chart, and it also suffers the same limitations if we want per-user/per-repo config. It also means the BinderHub registry check depends on a KubernetesExecutor property.
Move buildDockerConfig to KubernetesExecutor.build_docker_config, KubernetesExecutor has full ownership including creating the secret
This is the most flexible option, and would allow us to have a variable Docker build config on a per-user or per-build basis in future. This is similar to what's being done in https://github.com/jupyterhub/binderhub/pull/1577 . However it means we need another way for BinderHub to check the registry. Either a second, possibly duplicate secret, though this could be a read-only registry token if you wanted to be really strict about security. Or move all registry/docker configuration to the Registry class, and have KubernetesExecutor depend on the Registry.
The third option is the most flexible, but we could also go with the first option and make a new breaking change when we're ready to introduce per-user/per-repo config.
While I like option number three best. I'd say we should evaluate the time frame for having dynamic credential support merged and what it requires for them to be used "everywhere".
AFAIK, there are three different use cases:
- Use of credentials for Git related work
- Use of credentials for registry access
- In the case of forges like GitLab push back to the same project as the code cloned
This also requires to decide how to configure whether static or dynamic credentials are going to be used when deploying and for which part of BinderHub it applies.
On the point of handling credentials, I think it's also important to document somewhere how to retrieve them. If memory serves well, It's not just a question of using e.g. GitLab for the login part. There's also configuration to be done in order for it to provide actual information in the auth_state.
I remember starting doing some work on number three last year however I can't put my hands on it now. From memory, it was not overly complicated. I think the main issue I had was to configure BinderHub's authentication to retrieve the token once logged in.
Found something for the documentation part, it's in oauthenticator: https://github.com/jupyterhub/oauthenticator/issues/372