docker-stacks
Move environment setup from `start-notebook.sh` to ENTRYPOINT instead of CMD
Which Docker images does this feature apply to?
jupyter/base-notebook - Notebooks that use startup hooks to configure the environment
What changes do you propose?
Split start-notebook.sh or start.sh into two scripts: one that does the environment setup and runs as the ENTRYPOINT, and one that does the actual notebook startup and runs as the CMD.
Originally suggested in https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/2138#issuecomment-828291826
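As a rough illustration of the split, the setup half could look something like the sketch below. This is not the actual start.sh: the function names `run_hooks` and `entrypoint_main` and the `HOOK_DIR` variable are hypothetical, the default hook directory is an assumption, and the user/UID handling is left out entirely.

```shell
#!/bin/sh
# Hypothetical ENTRYPOINT script: do the environment setup, then hand over
# to whatever command Docker passes in (the image's CMD or the user's args).

# Source every *.sh file in a hook directory. The hooks are sourced, not
# executed, so variables they export are still set for the final command.
run_hooks() {
    hook_dir="$1"
    [ -d "$hook_dir" ] || return 0
    for hook in "$hook_dir"/*.sh; do
        [ -f "$hook" ] || continue
        . "$hook"
    done
}

# Run the setup, then replace this process with the requested command.
entrypoint_main() {
    # The default directory is an assumption; HOOK_DIR is a made-up override.
    run_hooks "${HOOK_DIR:-/usr/local/bin/start-notebook.d}"
    exec "$@"
}

# Only act when a command was supplied, so the file can also be sourced.
if [ "$#" -gt 0 ]; then
    entrypoint_main "$@"
fi
```

With something like this as the ENTRYPOINT, the CMD would only need to name the notebook startup command, and any arguments passed to docker run would still go through the setup first.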
How will this change affect users?
start-notebook.sh calls start.sh, which handles a lot of the setup for the Jupyter environment, including:
- Running setup hooks, for example to set environment variables https://github.com/jupyter/docker-stacks/blob/8dfdbfd3a3370ec138a7b85ae42422de8da3ca1b/base-notebook/start.sh#L44
- Customising user names and IDs when started as root https://github.com/jupyter/docker-stacks/blob/8dfdbfd3a3370ec138a7b85ae42422de8da3ca1b/base-notebook/start.sh#L46-L51
Since start-notebook.sh is set as the CMD, all of this setup is skipped if someone passes any arguments when running the Docker container. For example,
docker run -e NB_UID=12345 -u 0 jupyter/base-notebook jupyter-lab --debug
should change the UID from the default 1000 to 12345, but because the startup scripts never run, the UID is not changed and the command fails with an error (it can't be run as root). Instead, you must run
docker run -e NB_UID=12345 -u 0 jupyter/base-notebook start.sh jupyter-lab --debug
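To make the difference concrete, here is a small shell model of how Docker assembles the container command from an exec-form ENTRYPOINT and CMD: arguments given to docker run replace the CMD but are appended to the ENTRYPOINT. The `effective_command` helper is made up for illustration, and the `tini -g --` entrypoint and the exact script layout in the last call are assumptions, not taken from the image.

```shell
# Print the command a container would run.
# Usage: effective_command "<entrypoint>" "<default cmd>" [docker run args...]
effective_command() {
    entrypoint="$1"
    default_cmd="$2"
    shift 2
    if [ "$#" -gt 0 ]; then
        # Arguments were given: they replace the CMD entirely.
        printf '%s\n' "$entrypoint $*"
    else
        # No arguments: the image's default CMD is used.
        printf '%s\n' "$entrypoint $default_cmd"
    fi
}

# Current layout (assumed): the setup is reached only through the CMD.
effective_command "tini -g --" "start-notebook.sh"
# prints: tini -g -- start-notebook.sh
effective_command "tini -g --" "start-notebook.sh" jupyter-lab --debug
# prints: tini -g -- jupyter-lab --debug   (setup skipped)

# Proposed layout (hypothetical): the setup script is part of the ENTRYPOINT.
effective_command "tini -g -- start.sh" "start-notebook.sh" jupyter-lab --debug
# prints: tini -g -- start.sh jupyter-lab --debug   (setup still runs)
```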
A concrete example of where this is a problem for users is the pyspark notebook: it isn't obvious to a user that the pyspark environment is set up by a startup script rather than being baked into the Dockerfile.
- https://github.com/jupyter/docker-stacks/blob/8dfdbfd3a3370ec138a7b85ae42422de8da3ca1b/pyspark-notebook/Dockerfile#L48
- https://discourse.jupyter.org/t/pyspark-library-is-missing-from-jupyter-pyspark-notebook-when-running-with-jupyterhub-zero-to-jupyterhub-k8s/8450
Note that we're working around this in JupyterHub 2.0 and Z2JH 2.0 with a breaking change (https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/2449): instead of specifying jupyterhub-singleuser as the CMD when running the image, we'll use the image's default CMD. I think this change is still generally helpful, though.
This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:
https://discourse.jupyter.org/t/pyspark-library-is-missing-from-jupyter-pyspark-notebook-when-running-with-jupyterhub-zero-to-jupyterhub-k8s/8450/6
This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:
https://discourse.jupyter.org/t/cannot-use-sudo-have-root-access-using-jupyterhub-with-kubernetes/12548/5