kind-with-gpus-examples
Cluster creation fails on WSL2
Hi,
I'm trying nvkind on WSL2, in which the nvidia driver is installed in Windows and exposed to the Linux VM automagically.
I followed all the steps listed in the requirements section, and they all succeeded.
However, when I create a cluster using ./nvkind cluster create, the cluster is created and the post-processing steps install packages. During this process, I encounter the following error:
<log truncated for readability>
Setting up nvidia-container-toolkit-base (1.16.0~rc.2-1) ...
Setting up libnvidia-container1:amd64 (1.16.0~rc.2-1) ...
Setting up libnvidia-container-tools (1.16.0~rc.2-1) ...
Setting up nvidia-container-toolkit (1.16.0~rc.2-1) ...
Processing triggers for libc-bin (2.36-9+deb12u4) ...
time="2024-08-02T10:58:37Z" level=info msg="Loading config from /etc/containerd/config.toml"
time="2024-08-02T10:58:37Z" level=info msg="Wrote updated config to /etc/containerd/config.toml"
time="2024-08-02T10:58:37Z" level=info msg="It is recommended that containerd daemon be restarted."
umount: /proc/driver/nvidia: not found
F0802 12:58:38.097622 29676 main.go:45] Error: patching /proc/driver/nvidia on node 'nvkind-mz6kz-worker': running script on nvkind-mz6kz-worker: executing command: exit status 1
I guess the nvidia driver works differently on WSL2 than on a regular Linux host, and therefore /proc/driver/nvidia may not be present. How would I work around this issue?
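To check that guess, a minimal diagnostic sketch like the one below could be run inside the WSL2 distribution. It only reports whether a few paths exist. `report_path` is a hypothetical helper, not part of nvkind, and the two WSL2 paths (`/dev/dxg` and `/usr/lib/wsl/lib`) are my assumption about where WSL2 exposes the Windows driver, so they may differ on other setups.

```shell
#!/bin/sh
# Diagnostic sketch: report which GPU-related paths exist on this host.
# report_path is a hypothetical helper for illustration, not part of nvkind.
report_path() {
    if [ -e "$1" ]; then
        echo "present: $1"
    else
        echo "missing: $1"
    fi
}

# Path that the nvkind post-processing step tries to patch (from the error above).
report_path /proc/driver/nvidia
# Assumption: WSL2 exposes the GPU through this paravirtualization device.
report_path /dev/dxg
# Assumption: WSL2 mounts the Windows-provided driver libraries here.
report_path /usr/lib/wsl/lib
```

If the first path is reported as missing while the other two are present, that would be consistent with the `umount: /proc/driver/nvidia: not found` line in the log.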