zero-to-jupyterhub-k8s
Clarify in documentation that `pip install` won't be permanent if it's not installing into the persisted home folder
I am not sure whether my use case is actually undocumented or whether I just cannot find anything on it: on my JupyterHub user pod in k8s I used the `%pip` magic command to install additional packages. Then I wondered whether this installation was persistent or would disappear after a server restart. As it turns out, the packages are not permanent.
Proposed change
The documentation should highlight this behavior. What is the recommended way to handle user-specific package installations? Should they be reinstalled after every server restart? Should they be avoided altogether?
Alternative options
Ideally, there is some way to make package installations permanent via volume claims (or maybe this is already possible).
It is in scope to help clarify that anything installed outside a user's home folder will be temporary because of how user servers run: in Docker containers, with persistent storage mounted specifically for the home folder.
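As a rough way to see this from inside a running user server, the sketch below prints where packages end up and whether that location is under the home folder. It assumes the home folder is the persisted mount, which is the setup described above; `is_under` is a helper name made up for this example.

```python
import site
import sys
from pathlib import Path


def is_under(path: Path, root: Path) -> bool:
    """Return True if `path` is `root` itself or lives somewhere below it."""
    path, root = path.resolve(), root.resolve()
    return path == root or root in path.parents


if __name__ == "__main__":
    # Assumption: the home folder is the only persisted location.
    home = Path.home()
    # Root of the environment that a plain `pip install` writes into.
    env_prefix = Path(sys.prefix)
    # Directory that `pip install --user` writes into.
    user_site = Path(site.getusersitepackages())

    print(f"environment prefix: {env_prefix} (under home: {is_under(env_prefix, home)})")
    print(f"user site-packages: {user_site} (under home: {is_under(user_site, home)})")
```

If the environment prefix is reported as not under home, packages installed there are likely to be gone after the next server restart, while a user site-packages directory under home should survive.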
I'll let this issue represent the need to clarify that a bit better.
Regarding a "recommended way", I'd say there isn't a single one. This is a complicated matter, and people have solved it in many different ways. We can state that in the documentation as well, and reference solutions like:
- (common) Update the Docker image when you want something new in the environment for everyone.
- (common) Run `pip install --user <package>` to install something locally.
- Run `conda create` to create environments that are located in the home folder, which gives you an environment entirely separate from the base one. Then have `nb_conda_kernels` installed in the base environment so that the new environment shows up among the kernels notebooks can execute against, assuming it includes `ipykernel` or `irkernel`.
I will also add that there are environment variables you can use for caching pip and/or conda. In my case I have a shared PVC mounted to `/mnt/shared`, and `nb_conda_kernels` in my base image. That notably lets you share environments, but you need to use something like `conda create --clone A --name B` to avoid polluting your environments. You can combine this with version control or a system like `pre-commit` to always export and track your environment.
extraEnv:
  CONDA_ENVS_DIRS: /mnt/shared/conda_envs
  CONDA_PKGS_DIRS: /mnt/shared/conda_pkgs
  PIP_CACHE_DIR: /mnt/shared/pip_cache
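To check from a notebook or terminal that these variables actually reached the user server, something like the following sketch may help. It assumes the shared mount is `/mnt/shared` as in the config above and that each variable holds a single directory (conda may accept several directories in its variables, which this sketch does not handle). `report_cache_dirs` is a hypothetical helper, not part of any library.

```python
import os
from pathlib import Path
from typing import Dict, Mapping

# Variable names taken from the extraEnv config above.
CACHE_VARS = ("CONDA_ENVS_DIRS", "CONDA_PKGS_DIRS", "PIP_CACHE_DIR")


def report_cache_dirs(environ: Mapping[str, str], shared_root: str) -> Dict[str, str]:
    """Classify each variable as 'unset', 'ok', or 'outside shared root'."""
    root = Path(shared_root)
    report = {}
    for name in CACHE_VARS:
        value = environ.get(name)
        if not value:
            report[name] = "unset"
        elif Path(value) == root or root in Path(value).parents:
            report[name] = "ok"
        else:
            report[name] = "outside shared root"
    return report


if __name__ == "__main__":
    # Assumption: the shared PVC is mounted at /mnt/shared.
    for name, status in report_cache_dirs(os.environ, "/mnt/shared").items():
        print(f"{name}: {status}")
```

A variable reported as "unset" or "outside shared root" suggests that environments or caches are still being written to a location that is not persisted or not shared.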