zero-to-jupyterhub-k8s

Clarify in documentation that `pip install` won't be permanent if it's not installing into the persisted home folder

Open aloifolia opened this issue 3 years ago • 2 comments

I am not sure whether my use case is actually undocumented or whether I am just unable to find anything on it. On my JupyterHub user pod in k8s, I used the "%pip" magic command to install additional packages. Then I wondered whether this installation was persistent or would disappear after a server restart. As it turns out, the packages are not permanent.

Proposed change

The documentation should highlight this behavior. What is the recommended way to handle user-specific package installations? Should they be reinstalled after every server restart? Should they be avoided altogether?

Alternative options

Ideally, there is some way to make package installations permanent via volume claims (or maybe this is already possible).

aloifolia, Nov 05 '21 15:11

It is in scope to help clarify that anything installed outside a user's home folder will be temporary, due to how user servers run: in Docker containers where persistent storage is mounted specifically for the home folder.
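As a rough illustration of that rule, here is a minimal Python sketch that guesses whether an importable module is stored under the home folder. It assumes the home folder (as returned by `Path.home()`) is the only persisted location, which depends on how the deployment mounts storage. The helper names `is_under` and `survives_restart` are made up for this example.

```python
import importlib.util
from pathlib import Path


def is_under(path, root):
    """Return True if `path` is `root` itself or located somewhere inside it."""
    path, root = Path(path).resolve(), Path(root).resolve()
    return path == root or root in path.parents


def survives_restart(module_name, persisted_root=None):
    """Guess whether a module's files live on the persisted home folder.

    Assumption: only `persisted_root` (default: the user's home folder)
    is backed by persistent storage. Returns None when the module cannot
    be found or has no file location (e.g. built-in modules).
    """
    root = Path(persisted_root) if persisted_root else Path.home()
    spec = importlib.util.find_spec(module_name)
    if spec is None or not spec.origin or spec.origin in ("built-in", "frozen"):
        return None
    return is_under(spec.origin, root)
```

In a notebook, calling `survives_restart("some_package")` after an install gives a quick hint: `False` suggests the package sits in the image's environment and would be gone after a server restart.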

I'll let this issue represent the need to clarify that a bit better.

Regarding a "recommended way", I'd say there isn't a single one. This is a complicated matter, and people have solved it in many different ways. We can mention that as well, and reference solutions like:

  • (common) Update the Docker image when you want something new in the environment for everyone
  • (common) Run pip install --user <package> to install something locally for a single user
  • Run conda create so that the new environments are located in the home folder and are entirely separate from the base environment. With nb_conda_kernels installed in the base environment, such an environment shows up among the kernels that notebooks can execute against, assuming it includes ipykernel or irkernel.
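
To make the pip install --user option a bit more concrete, here is a small Python sketch that builds the install command for the interpreter the notebook is running on and prints where such installs end up. The function names are hypothetical, and "some-package" is a placeholder. Note that pip refuses --user inside a virtual environment, so this assumes a regular (non-venv) interpreter.

```python
import site
import subprocess
import sys


def user_install_command(package):
    """Build the pip command that installs `package` into the per-user site directory."""
    return [sys.executable, "-m", "pip", "install", "--user", package]


def install_for_user(package):
    """Run the install; raises subprocess.CalledProcessError if pip fails."""
    subprocess.run(user_install_command(package), check=True)


if __name__ == "__main__":
    # Directory that --user installs go to (under ~/.local by default on Linux).
    print("user site-packages:", site.getusersitepackages())
    print("command:", " ".join(user_install_command("some-package")))
```

Since the per-user site directory is inside the home folder by default, packages installed this way stay available as long as the home folder is persisted.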

consideRatio, Nov 05 '21 16:11

I will also add that there are environment variables you can use for caching pip and/or conda. In my case, I have a shared PVC mounted at /mnt/shared, and nb_conda_kernels in my base image. That notably lets you share environments, but you need to use something like conda create --clone A --name B to avoid polluting your environments. You can combine this with version control or a system like pre-commit to always export and track your environment.

extraEnv:
  CONDA_ENVS_DIRS: /mnt/shared/conda_envs
  CONDA_PKGS_DIRS: /mnt/shared/conda_pkgs
  PIP_CACHE_DIR: /mnt/shared/pip_cache
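
To check from inside a user server that these variables actually point somewhere usable, something like the following Python sketch could help. It assumes each variable holds a single directory, as in the snippet above, and the function name `report_cache_dirs` is made up for this example.

```python
import os

# The variables set via extraEnv in the snippet above.
CACHE_VARS = ("CONDA_ENVS_DIRS", "CONDA_PKGS_DIRS", "PIP_CACHE_DIR")


def report_cache_dirs(environ=os.environ):
    """Map each variable to (value, usable).

    `value` is None when the variable is unset; `usable` is True only
    when the value is an existing directory the current user can write to.
    """
    report = {}
    for name in CACHE_VARS:
        value = environ.get(name)
        usable = bool(value) and os.path.isdir(value) and os.access(value, os.W_OK)
        report[name] = (value, usable)
    return report


if __name__ == "__main__":
    for name, (value, usable) in report_cache_dirs().items():
        print(f"{name}={value!r} usable={usable}")
```

A result of `usable=False` for a variable that is set would suggest the shared volume is not mounted or not writable for the user.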

cyrilcros, Jan 03 '22 22:01