docker-stacks

Move environment setup from `start-notebook.sh` to ENTRYPOINT instead of CMD

Open manics opened this issue 4 years ago • 2 comments

Which Docker images is this feature applicable to?

  • jupyter/base-notebook
  • Notebooks that use startup hooks to configure the environment

What changes do you propose?

Split `start-notebook.sh` or `start.sh` into a script that does the environment setup in an ENTRYPOINT, and a script that does the actual notebook startup in CMD.
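A minimal sketch of what the ENTRYPOINT half of that split could look like. This is not the existing `start.sh`; the `run_hooks` function, the `HOOKS_DIR` variable and its default path are assumptions made for illustration. The idea is only that setup runs first and the script then hands off to whatever command it was given, whether that is the image's default CMD or arguments passed to `docker run`.

```shell
#!/bin/bash
# Hypothetical entrypoint sketch: do the environment setup, then hand off to the command.
set -e

# Source every *.sh file in a hooks directory so that exported variables
# are visible to the command started afterwards.
run_hooks() {
    local dir="$1"
    # A missing hooks directory is not an error.
    if [ ! -d "$dir" ]; then
        return 0
    fi
    local f
    for f in "$dir"/*.sh; do
        if [ -f "$f" ]; then
            . "$f"
        fi
    done
}

# HOOKS_DIR and its default path are assumptions for this sketch.
run_hooks "${HOOKS_DIR:-/usr/local/bin/start-notebook.d}"

# Replace this shell with the requested command (the CMD or the user's arguments).
if [ "$#" -gt 0 ]; then
    exec "$@"
fi
```

Because the setup lives in the entrypoint, overriding the command no longer skips it.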

Originally suggested in https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/2138#issuecomment-828291826

How will this change affect users?

`start-notebook.sh` calls `start.sh`, which handles a lot of setup in the Jupyter environment, including:

  • Running setup hooks, for example to set environment variables: https://github.com/jupyter/docker-stacks/blob/8dfdbfd3a3370ec138a7b85ae42422de8da3ca1b/base-notebook/start.sh#L44
  • Customising user names and IDs when started as root: https://github.com/jupyter/docker-stacks/blob/8dfdbfd3a3370ec138a7b85ae42422de8da3ca1b/base-notebook/start.sh#L46-L51

Since `start-notebook.sh` is set as the CMD, all of this setup is skipped if someone passes any arguments when running the Docker container, because those arguments replace the CMD. For example, `docker run -e NB_UID=12345 -u 0 jupyter/base-notebook jupyter-lab --debug` should change the UID from the default 1000 to 12345, but since the startup scripts aren't run, this leads to an error (can't be run as root). Instead you must run `docker run -e NB_UID=12345 -u 0 jupyter/base-notebook start.sh jupyter-lab --debug`.
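To illustrate why the arguments bypass the setup, here is a small sketch of how Docker combines an exec-form ENTRYPOINT and CMD: arguments given after the image name replace the CMD, while the ENTRYPOINT always stays in front. The `effective_command` helper is hypothetical, and the `tini -g --` entrypoint is an assumption about the image's current Dockerfile rather than something stated in this issue.

```shell
#!/bin/bash
# Print the command line a container would run.
# $1: ENTRYPOINT, $2: default CMD, remaining arguments: what follows the image name in `docker run`.
effective_command() {
    local entrypoint="$1" default_cmd="$2"
    shift 2
    # Arguments passed to `docker run` replace the CMD entirely.
    local cmd="${*:-$default_cmd}"
    echo "${entrypoint:+$entrypoint }$cmd"
}

# Current layout (entrypoint assumed): setup only happens inside the CMD.
effective_command "tini -g --" "start-notebook.sh"
# prints: tini -g -- start-notebook.sh
effective_command "tini -g --" "start-notebook.sh" jupyter-lab --debug
# prints: tini -g -- jupyter-lab --debug   (no startup script, so no setup)

# Proposed layout: setup moves into the ENTRYPOINT and survives a CMD override.
effective_command "tini -g -- start.sh" "start-notebook.sh" jupyter-lab --debug
# prints: tini -g -- start.sh jupyter-lab --debug
```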

A concrete example of where this is a problem for users is the pyspark notebook: it isn't obvious to a user that the pyspark environment is set up by a startup script rather than being baked into the Dockerfile.

  • https://github.com/jupyter/docker-stacks/blob/8dfdbfd3a3370ec138a7b85ae42422de8da3ca1b/pyspark-notebook/Dockerfile#L48
  • https://discourse.jupyter.org/t/pyspark-library-is-missing-from-jupyter-pyspark-notebook-when-running-with-jupyterhub-zero-to-jupyterhub-k8s/8450

Note that we're working around this in JupyterHub 2.0 and Z2JH 2.0 with a breaking change (https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/2449): instead of specifying `jupyterhub-singleuser` as the CMD when running the image, we'll use the image's default CMD. I think this change is still generally helpful, though.

manics, Nov 12 '21 13:11

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/pyspark-library-is-missing-from-jupyter-pyspark-notebook-when-running-with-jupyterhub-zero-to-jupyterhub-k8s/8450/6

meeseeksmachine, Nov 12 '21 14:11

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/cannot-use-sudo-have-root-access-using-jupyterhub-with-kubernetes/12548/5

meeseeksmachine, Feb 08 '22 21:02