Using extraPackages override in template does not provide those packages in the python/jupyter environment.

Open djacu opened this issue 1 year ago • 3 comments

Describe the bug

One of the methods described in #281 for extending the available python kernel packages is to use the extraPackages attribute of the mkPoetryEnv function from poetry2nix. However, in my own and @GTrunSec's testing, this does not provide the desired packages to the resulting jupyter/python environment, and we are not able to import them.

To Reproduce

  1. Initialize a project file using the repo template with:

nix flake init --template github:tweag/jupyterWith/main

  2. Modify the flake.nix file inputs to reference the main branch. This is a temporary fix until main is merged to master or becomes the default branch.

inputs.jupyterWith.url = "github:tweag/jupyterWith/main";

  3. Modify the kernels/python.nix file to include additional packages by overriding the extraPackages attribute:

{
  pkgs,
  availableKernels,
  name,
}:
let
  python = availableKernels.python.override {
    extraPackages = ps: [ ps.docopt ps.numpy ];
  };
in
  python {
    displayName = "python with numpy";
  }

  4. Start the JupyterLab environment with nix run.
  5. Try to import numpy or docopt; the import fails in either case.

Expected behavior

The packages provided via extraPackages should be available and importable.

Environment

  • OS name + version: 22.11.20220816.762b003
  • Version of the code: e04ae77e2f26fcd6e6f0c5e0fdd4bba481c0f9c9

Additional context

Research

Building jupyterWith repo

Some further testing reveals a solution and hints at the source of the problem. I suspected that the kernel was not running the python interpreter specified in the argv value of the kernelspec. This turned out to be true.

First I tried using extraPackages attribute in the repo (not template) python kernel. I modified kernels/python/default.nix as follows:

{
  self,
  pkgs,
  # https://github.com/nix-community/poetry2nix#mkPoetryEnv
  projectDir ? self + "/kernels/python",
  pyproject ? projectDir + "/pyproject.toml",
  poetrylock ? projectDir + "/poetry.lock",
  overrides ? pkgs.poetry2nix.overrides.withDefaults (import ./overrides.nix),
  python ? pkgs.python3,
  editablePackageSources ? {},
  extraPackages ? ps: [ps.docopt ps.numpy],
  preferWheels ? false,
}: let

Note that extraPackages usually has a default of ps: [] and I have changed it to provide docopt and numpy by default.

I built the JupyterLab environment containing only the python kernel with nix build .#jupyterlab-kernel-python and started it with ./result/bin/jupyter-lab. I was able to import both docopt and numpy.

The following checks that the python specified in the kernelspec matches the one JupyterLab actually runs.

$ cat ./result/bin/jupyter-lab
...
export JUPYTER_PATH='/nix/store/7xghzrbkk7gp9s16p0hlp36lpw3hlpiw-example_python-jupyter-kernel'
...
$ cat /nix/store/7xghzrbkk7gp9s16p0hlp36lpw3hlpiw-example_python-jupyter-kernel/kernels/example_python/kernel.json | jq
{
  "argv": [
    "/nix/store/v3hfg04jlg5q8r7ddfy2kf1ykgpxdriv-python3-3.10.5-env/bin/python",
    "-m",
    "ipykernel_launcher",
    "-f",
    "{connection_file}"
  ],
...

Entering the python environment specified above, /nix/store/v3hfg04jlg5q8r7ddfy2kf1ykgpxdriv-python3-3.10.5-env/bin/python, I was able to import both numpy and docopt.
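
The manual lookup above (wrapper script, then JUPYTER_PATH, then kernel.json, then argv) can also be scripted. The following is a minimal sketch, assuming the layout shown in the shell output, where each JUPYTER_PATH entry contains kernels/<name>/kernel.json. The function name list_kernel_interpreters is hypothetical and not part of jupyterWith or jupyter_client.

```python
import json
import os
from pathlib import Path


def list_kernel_interpreters(jupyter_path: str) -> dict:
    """Map each kernel directory name to the first argv entry of its kernel.json.

    jupyter_path is the value exported as JUPYTER_PATH; several entries may be
    joined with os.pathsep.
    """
    found = {}
    for entry in jupyter_path.split(os.pathsep):
        if not entry:
            continue
        # Assumed layout: <entry>/kernels/<kernel name>/kernel.json
        for spec_file in sorted(Path(entry).glob("kernels/*/kernel.json")):
            spec = json.loads(spec_file.read_text())
            # Keep the first hit for a given name; later duplicates are ignored.
            found.setdefault(spec_file.parent.name, spec["argv"][0])
    return found
```

Passing the store path copied from the export JUPYTER_PATH line of ./result/bin/jupyter-lab should return the interpreter that each kernel.json points at, which can then be compared with sys.executable inside a notebook.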

In JupyterLab, I ran the following code in a cell and it verified that it was using the aforementioned version of python.

import sys
sys.executable

Building jupyterWith template

I followed the same steps as in the previous section. The difference is that I am using the template kernel, modified as shown in To Reproduce. The project is built with nix build .# and JupyterLab is started with ./result/bin/jupyter-lab. In this environment, I cannot import numpy or docopt.

Again, I checked which python the kernelspec specifies.

$ cat ./result/bin/jupyter-lab
...
export JUPYTER_PATH='/nix/store/90smimg2238768pz8rx5zkcr62jw7217-python-jupyter-kernel'
...
$ cat /nix/store/90smimg2238768pz8rx5zkcr62jw7217-python-jupyter-kernel/kernels/python/kernel.json | jq
{
  "argv": [
    "/nix/store/v3hfg04jlg5q8r7ddfy2kf1ykgpxdriv-python3-3.10.5-env/bin/python",
    "-m",
    "ipykernel_launcher",
    "-f",
    "{connection_file}"
  ],

Entering the python environment shown in the kernelspec, /nix/store/v3hfg04jlg5q8r7ddfy2kf1ykgpxdriv-python3-3.10.5-env/bin/python, I can import both numpy and docopt.

In JupyterLab, I ran the following code in a cell and it showed a different python being used.

import sys
sys.executable
>>> /nix/store/ggcvj39h64157i1bijfl8njqb82j8ya8-python3-3.10.5-env/bin/python3.10

This version of python does not have numpy or docopt available.
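
A notebook cell along the following lines makes the same check in one step: it prints the interpreter the kernel is actually running and reports whether each package can be found, without importing it. This is a sketch, and the helper name check_modules is hypothetical.

```python
import importlib.util
import sys


def check_modules(names):
    """Return {module name: True if this interpreter can locate the module}."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# The interpreter running this cell; compare it with argv[0] in kernel.json.
print("interpreter:", sys.executable)
for name, available in check_modules(["numpy", "docopt"]).items():
    print(f"{name}: {'available' if available else 'missing'}")
```

If the printed interpreter differs from the one in kernel.json, the kernel was started from a different kernelspec than the one inspected on disk.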

Solution

I believe the problem lies in the name argument passed to the kernel specification. If I modify the jupyter kernel in the template flake as follows, everything works as expected.

{
  pkgs,
  availableKernels,
  name,
}:
let
  python = availableKernels.python.override {
    extraPackages = ps: [ ps.docopt ps.numpy ];
  };
in
  python {
    name = "python-with-numpy";
    displayName = "python with numpy";
  }

From previous experimentation, I found that you cannot specify two kernels with the same name attribute. I believe that when we do not set the name attribute, the overridden kernel keeps the default name and the lookup somehow resolves back to the original python kernel.
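
To illustrate this hypothesis, here is a simplified model of a name-keyed kernel lookup. It assumes that kernels are identified by their directory name and that the first match across the search directories wins. It is not the jupyterWith or jupyter_client implementation, and resolve_kernel is a hypothetical name.

```python
import json
from pathlib import Path


def resolve_kernel(name, search_dirs):
    """Return the argv of the first kernels/<name>/kernel.json found, else None."""
    for base in search_dirs:
        spec_file = Path(base) / "kernels" / name / "kernel.json"
        if spec_file.is_file():
            # First match wins: a later kernel with the same name is never reached.
            return json.loads(spec_file.read_text())["argv"]
    return None
```

Under this model, two kernelspecs that are both named python shadow each other, and only the one found first is ever launched. Giving the override a distinct name, as in the snippet above, removes the collision.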

I am still pondering how to fix this in the nix code as we cannot expect end users to do this.

djacu, Sep 10 '22 06:09

The only meaningful documentation I could find on the kernelspec is in the jupyter-client documentation, but it does not mention anything about a name attribute.

djacu, Sep 10 '22 06:09

Also, because this is at the kernelspec level, I believe this would affect any kernel where the name attribute is not overridden.

djacu, Sep 10 '22 06:09

There is a name attribute of the KernelSpec class. https://github.com/jupyter/jupyter_client/blob/5c3afcf7480ec0aa91a32060aef7e75805e1ce97/jupyter_client/kernelspec.py#L34

It does appear to be used to identify whether a kernel exists. https://jupyter-client.readthedocs.io/en/stable/api/kernelspec.html#jupyter_client.kernelspec.NoSuchKernel https://github.com/jupyter/jupyter_client/blob/5c3afcf7480ec0aa91a32060aef7e75805e1ce97/jupyter_client/kernelspec.py#L121

This appears to be the path that loads kernel.json files if kernel_name != "python3": https://github.com/jupyter/jupyter_client/blob/5c3afcf7480ec0aa91a32060aef7e75805e1ce97/jupyter_client/kernelspec.py#L257 https://github.com/jupyter/jupyter_client/blob/5c3afcf7480ec0aa91a32060aef7e75805e1ce97/jupyter_client/kernelspec.py#L44

Still trying to figure out how name is used.

djacu, Sep 10 '22 17:09