
Nvidia driver not detected on WSL2

boyang9602 opened this issue 3 years ago · 4 comments

1. Issue or feature description

I'm trying to use nvidia-docker on WSL 2. I installed the driver on the host and followed this guide to install nvidia-docker2.
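(For reference, the install steps in guides of that era boil down to roughly the following sketch. Ubuntu is assumed, the repository list URL depends on your distribution, and this is an outline rather than the authoritative instructions:)

$ distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt-get update && sudo apt-get install -y nvidia-docker2
$ sudo service docker restart    # systemctl is often unavailable under WSL2, so use service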

When I run docker run --gpus all -it --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/tensorflow:20.03-tf2-py3, the output is not as expected:

================
== TensorFlow ==
================

NVIDIA Release 20.03-tf2 (build 11026100)
TensorFlow Version 2.1.0

Container image Copyright (c) 2019, NVIDIA CORPORATION.  All rights reserved.
Copyright 2017-2019 The TensorFlow Authors.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION.  All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying project or file.

WARNING: The NVIDIA Driver was not detected.  GPU functionality will not be available.
   Use 'nvidia-docker run' to start this container; see
   https://github.com/NVIDIA/nvidia-docker/wiki/nvidia-docker .

NOTE: MOFED driver for multi-node communication was not detected.
      Multi-node communication performance may be reduced.

When I try nvidia-docker run --gpus all -it --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/tensorflow:20.03-tf2-py3, the output is:

docker: Error response from daemon: Unknown runtime specified nvidia.
See 'docker run --help'.
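(A likely cause of this "Unknown runtime specified nvidia" error is that dockerd was never restarted after nvidia-docker2 wrote its config. On a default nvidia-docker2 install — an assumption; check your own file — /etc/docker/daemon.json should contain roughly:

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

followed by a daemon restart, e.g. sudo service docker restart under WSL2. Note also that on recent Docker versions --gpus all does not go through the nvidia runtime at all, so plain docker run is the more relevant test.)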

I tried modprobe nvidia; the output was modprobe: FATAL: Module nvidia not found in directory /lib/modules/4.19.128-microsoft-standard.
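(The modprobe failure itself is expected on WSL2: the GPU there is paravirtualized through the Microsoft kernel rather than driven by an in-guest nvidia kernel module, so there is no nvidia.ko to load. A quick sanity check of the WSL2 GPU plumbing, assuming the standard layout that also shows up in the logs below:

$ ls -l /dev/dxg          # GPU paravirtualization device exposed by the WSL2 kernel
$ ls /usr/lib/wsl/lib     # userspace driver libraries (libcuda.so.1, libnvidia-ml.so.1, ...) mounted from Windows
$ nvidia-smi              # runs against those libraries, with no nvidia module loaded
)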

3. Information to attach (optional if deemed irrelevant)

  • [ ] Some nvidia-container information: nvidia-container-cli -k -d /dev/tty info
$ nvidia-container-cli -k -d /dev/tty info

-- WARNING, the following logs are for debugging purposes only --

I0203 18:12:33.732533 1637 nvc.c:376] initializing library context (version=1.8.0~rc.2, build=d48f9b0d505fca0aff7c88cee790f9c56aa1b851)
I0203 18:12:33.732591 1637 nvc.c:350] using root /
I0203 18:12:33.732597 1637 nvc.c:351] using ldcache /etc/ld.so.cache
I0203 18:12:33.732600 1637 nvc.c:352] using unprivileged user 1000:1000
I0203 18:12:33.732620 1637 nvc.c:393] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0203 18:12:33.749801 1637 dxcore.c:227] Creating a new WDDM Adapter for hAdapter:40000000 luid:356cc6
I0203 18:12:33.756366 1637 dxcore.c:210] Core Nvidia component libcuda.so.1.1 not found in /usr/lib/wsl/drivers/iigd_dch.inf_amd64_4be767c332df1d04
I0203 18:12:33.756975 1637 dxcore.c:210] Core Nvidia component libcuda_loader.so not found in /usr/lib/wsl/drivers/iigd_dch.inf_amd64_4be767c332df1d04
I0203 18:12:33.757599 1637 dxcore.c:210] Core Nvidia component libnvidia-ptxjitcompiler.so.1 not found in /usr/lib/wsl/drivers/iigd_dch.inf_amd64_4be767c332df1d04
I0203 18:12:33.758180 1637 dxcore.c:210] Core Nvidia component libnvidia-ml.so.1 not found in /usr/lib/wsl/drivers/iigd_dch.inf_amd64_4be767c332df1d04
I0203 18:12:33.758826 1637 dxcore.c:210] Core Nvidia component libnvidia-ml_loader.so not found in /usr/lib/wsl/drivers/iigd_dch.inf_amd64_4be767c332df1d04
I0203 18:12:33.759386 1637 dxcore.c:210] Core Nvidia component nvidia-smi not found in /usr/lib/wsl/drivers/iigd_dch.inf_amd64_4be767c332df1d04
I0203 18:12:33.759410 1637 dxcore.c:215] No Nvidia component found in /usr/lib/wsl/drivers/iigd_dch.inf_amd64_4be767c332df1d04
E0203 18:12:33.759431 1637 dxcore.c:261] Failed to query the core Nvidia libraries for the adapter. Skipping it.
I0203 18:12:33.759451 1637 dxcore.c:227] Creating a new WDDM Adapter for hAdapter:40000040 luid:356dcf
I0203 18:12:33.765143 1637 dxcore.c:268] Adding new adapter via dxcore hAdapter:40000040 luid:356dcf wddm version:3000
I0203 18:12:33.765181 1637 dxcore.c:326] dxcore layer initialized successfully
W0203 18:12:33.765546 1637 nvc.c:401] skipping kernel modules load on WSL
I0203 18:12:33.765686 1638 rpc.c:71] starting driver rpc service
I0203 18:12:33.812286 1639 rpc.c:71] starting nvcgo rpc service
I0203 18:12:33.817953 1637 nvc_info.c:759] requesting driver information with ''
I0203 18:12:33.904704 1637 nvc_info.c:198] selecting /usr/lib/wsl/lib/libnvidia-opticalflow.so.1
I0203 18:12:33.905591 1637 nvc_info.c:198] selecting /usr/lib/wsl/lib/libnvidia-ml.so.1
I0203 18:12:33.906351 1637 nvc_info.c:198] selecting /usr/lib/wsl/lib/libnvidia-encode.so.1
I0203 18:12:33.907139 1637 nvc_info.c:198] selecting /usr/lib/wsl/lib/libnvcuvid.so.1
I0203 18:12:33.907224 1637 nvc_info.c:198] selecting /usr/lib/wsl/lib/libdxcore.so
I0203 18:12:33.907257 1637 nvc_info.c:198] selecting /usr/lib/wsl/lib/libcuda.so.1
W0203 18:12:33.907319 1637 nvc_info.c:398] missing library libnvidia-cfg.so
W0203 18:12:33.907338 1637 nvc_info.c:398] missing library libnvidia-nscq.so
W0203 18:12:33.907341 1637 nvc_info.c:398] missing library libnvidia-opencl.so
W0203 18:12:33.907343 1637 nvc_info.c:398] missing library libnvidia-ptxjitcompiler.so
W0203 18:12:33.907345 1637 nvc_info.c:398] missing library libnvidia-fatbinaryloader.so
W0203 18:12:33.907346 1637 nvc_info.c:398] missing library libnvidia-allocator.so
W0203 18:12:33.907348 1637 nvc_info.c:398] missing library libnvidia-compiler.so
W0203 18:12:33.907349 1637 nvc_info.c:398] missing library libnvidia-pkcs11.so
W0203 18:12:33.907351 1637 nvc_info.c:398] missing library libnvidia-ngx.so
W0203 18:12:33.907352 1637 nvc_info.c:398] missing library libvdpau_nvidia.so
W0203 18:12:33.907354 1637 nvc_info.c:398] missing library libnvidia-eglcore.so
W0203 18:12:33.907355 1637 nvc_info.c:398] missing library libnvidia-glcore.so
W0203 18:12:33.907357 1637 nvc_info.c:398] missing library libnvidia-tls.so
W0203 18:12:33.907359 1637 nvc_info.c:398] missing library libnvidia-glsi.so
W0203 18:12:33.907360 1637 nvc_info.c:398] missing library libnvidia-fbc.so
W0203 18:12:33.907362 1637 nvc_info.c:398] missing library libnvidia-ifr.so
W0203 18:12:33.907363 1637 nvc_info.c:398] missing library libnvidia-rtcore.so
W0203 18:12:33.907365 1637 nvc_info.c:398] missing library libnvoptix.so
W0203 18:12:33.907366 1637 nvc_info.c:398] missing library libGLX_nvidia.so
W0203 18:12:33.907368 1637 nvc_info.c:398] missing library libEGL_nvidia.so
W0203 18:12:33.907369 1637 nvc_info.c:398] missing library libGLESv2_nvidia.so
W0203 18:12:33.907371 1637 nvc_info.c:398] missing library libGLESv1_CM_nvidia.so
W0203 18:12:33.907372 1637 nvc_info.c:398] missing library libnvidia-glvkspirv.so
W0203 18:12:33.907374 1637 nvc_info.c:398] missing library libnvidia-cbl.so
W0203 18:12:33.907375 1637 nvc_info.c:402] missing compat32 library libnvidia-ml.so
W0203 18:12:33.907390 1637 nvc_info.c:402] missing compat32 library libnvidia-cfg.so
W0203 18:12:33.907394 1637 nvc_info.c:402] missing compat32 library libnvidia-nscq.so
W0203 18:12:33.907396 1637 nvc_info.c:402] missing compat32 library libcuda.so
W0203 18:12:33.907399 1637 nvc_info.c:402] missing compat32 library libnvidia-opencl.so
W0203 18:12:33.907414 1637 nvc_info.c:402] missing compat32 library libnvidia-ptxjitcompiler.so
W0203 18:12:33.907431 1637 nvc_info.c:402] missing compat32 library libnvidia-fatbinaryloader.so
W0203 18:12:33.907434 1637 nvc_info.c:402] missing compat32 library libnvidia-allocator.so
W0203 18:12:33.907436 1637 nvc_info.c:402] missing compat32 library libnvidia-compiler.so
W0203 18:12:33.907437 1637 nvc_info.c:402] missing compat32 library libnvidia-pkcs11.so
W0203 18:12:33.907439 1637 nvc_info.c:402] missing compat32 library libnvidia-ngx.so
W0203 18:12:33.907441 1637 nvc_info.c:402] missing compat32 library libvdpau_nvidia.so
W0203 18:12:33.907442 1637 nvc_info.c:402] missing compat32 library libnvidia-encode.so
W0203 18:12:33.907444 1637 nvc_info.c:402] missing compat32 library libnvidia-opticalflow.so
W0203 18:12:33.907447 1637 nvc_info.c:402] missing compat32 library libnvcuvid.so
W0203 18:12:33.907461 1637 nvc_info.c:402] missing compat32 library libnvidia-eglcore.so
W0203 18:12:33.907479 1637 nvc_info.c:402] missing compat32 library libnvidia-glcore.so
W0203 18:12:33.907482 1637 nvc_info.c:402] missing compat32 library libnvidia-tls.so
W0203 18:12:33.907484 1637 nvc_info.c:402] missing compat32 library libnvidia-glsi.so
W0203 18:12:33.907486 1637 nvc_info.c:402] missing compat32 library libnvidia-fbc.so
W0203 18:12:33.907488 1637 nvc_info.c:402] missing compat32 library libnvidia-ifr.so
W0203 18:12:33.907489 1637 nvc_info.c:402] missing compat32 library libnvidia-rtcore.so
W0203 18:12:33.907491 1637 nvc_info.c:402] missing compat32 library libnvoptix.so
W0203 18:12:33.907492 1637 nvc_info.c:402] missing compat32 library libGLX_nvidia.so
W0203 18:12:33.907494 1637 nvc_info.c:402] missing compat32 library libEGL_nvidia.so
W0203 18:12:33.907495 1637 nvc_info.c:402] missing compat32 library libGLESv2_nvidia.so
W0203 18:12:33.907499 1637 nvc_info.c:402] missing compat32 library libGLESv1_CM_nvidia.so
W0203 18:12:33.907500 1637 nvc_info.c:402] missing compat32 library libnvidia-glvkspirv.so
W0203 18:12:33.907527 1637 nvc_info.c:402] missing compat32 library libnvidia-cbl.so
W0203 18:12:33.907531 1637 nvc_info.c:402] missing compat32 library libdxcore.so
I0203 18:12:33.908902 1637 nvc_info.c:278] selecting /usr/lib/wsl/drivers/nvlti.inf_amd64_f0a75371d3692c1a/nvidia-smi
W0203 18:12:34.217108 1637 nvc_info.c:424] missing binary nvidia-debugdump
W0203 18:12:34.217139 1637 nvc_info.c:424] missing binary nvidia-persistenced
W0203 18:12:34.217143 1637 nvc_info.c:424] missing binary nv-fabricmanager
W0203 18:12:34.217144 1637 nvc_info.c:424] missing binary nvidia-cuda-mps-control
W0203 18:12:34.217146 1637 nvc_info.c:424] missing binary nvidia-cuda-mps-server
I0203 18:12:34.217164 1637 nvc_info.c:439] skipping path lookup for dxcore
I0203 18:12:34.217179 1637 nvc_info.c:522] listing device /dev/dxg
W0203 18:12:34.217207 1637 nvc_info.c:348] missing ipc path /var/run/nvidia-persistenced/socket
W0203 18:12:34.217248 1637 nvc_info.c:348] missing ipc path /var/run/nvidia-fabricmanager/socket
W0203 18:12:34.217278 1637 nvc_info.c:348] missing ipc path /tmp/nvidia-mps
I0203 18:12:34.217299 1637 nvc_info.c:815] requesting device information with ''
I0203 18:12:34.227700 1637 nvc_info.c:687] listing dxcore adapter 0 (GPU-b5e386b4-3e71-5837-aca5-80c5914cf07f at 00000000:01:00.0)
NVRM version:   510.06
CUDA version:   11.6

Device Index:   0
Device Minor:   0
Model:          NVIDIA GeForce GTX 1650 Ti with Max-Q Design
Brand:          GeForce
GPU UUID:       GPU-b5e386b4-3e71-5837-aca5-80c5914cf07f
Bus Location:   00000000:01:00.0
Architecture:   7.5
I0203 18:12:34.227772 1637 nvc.c:430] shutting down library context
I0203 18:12:34.227859 1639 rpc.c:95] terminating nvcgo rpc service
I0203 18:12:34.228242 1637 rpc.c:135] nvcgo rpc service terminated successfully
I0203 18:12:34.229403 1638 rpc.c:95] terminating driver rpc service
I0203 18:12:34.230364 1637 rpc.c:135] driver rpc service terminated successfully
  • [ ] Kernel version from uname -a
$ uname -a
Linux LAPTOP-E1MFF41S 4.19.128-microsoft-standard #1 SMP Tue Jun 23 12:58:10 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  • [ ] Any relevant kernel output lines from dmesg
  • [ ] Driver information from nvidia-smi -a
$ nvidia-smi -a

==============NVSMI LOG==============

Timestamp                                 : Thu Feb  3 13:18:48 2022
Driver Version                            : 510.06
CUDA Version                              : 11.6

Attached GPUs                             : 1
GPU 00000000:01:00.0
   Product Name                          : NVIDIA GeForce GTX 1650 Ti with Max-Q Design
   Product Brand                         : GeForce
   Product Architecture                  : Turing
   Display Mode                          : Enabled
   Display Active                        : Enabled
   Persistence Mode                      : Enabled
   MIG Mode
       Current                           : N/A
       Pending                           : N/A
   Accounting Mode                       : Disabled
   Accounting Mode Buffer Size           : 4000
   Driver Model
       Current                           : WDDM
       Pending                           : WDDM
   Serial Number                         : N/A
   GPU UUID                              : GPU-b5e386b4-3e71-5837-aca5-80c5914cf07f
   Minor Number                          : N/A
   VBIOS Version                         : 90.17.41.00.46
   MultiGPU Board                        : No
   Board ID                              : 0x100
   GPU Part Number                       : N/A
   Module ID                             : 0
   Inforom Version
       Image Version                     : G001.0000.02.04
       OEM Object                        : 1.1
       ECC Object                        : N/A
       Power Management Object           : N/A
   GPU Operation Mode
       Current                           : N/A
       Pending                           : N/A
   GSP Firmware Version                  : N/A
   GPU Virtualization Mode
       Virtualization Mode               : None
       Host VGPU Mode                    : N/A
   IBMNPU
       Relaxed Ordering Mode             : N/A
   PCI
       Bus                               : 0x01
       Device                            : 0x00
       Domain                            : 0x0000
       Device Id                         : 0x1F9510DE
       Bus Id                            : 00000000:01:00.0
       Sub System Id                     : 0x22C017AA
       GPU Link Info
           PCIe Generation
               Max                       : 3
               Current                   : 3
           Link Width
               Max                       : 16x
               Current                   : 16x
       Bridge Chip
           Type                          : N/A
           Firmware                      : N/A
       Replays Since Reset               : 0
       Replay Number Rollovers           : 0
       Tx Throughput                     : 218000 KB/s
       Rx Throughput                     : 1000 KB/s
   Fan Speed                             : N/A
   Performance State                     : P8
   Clocks Throttle Reasons
       Idle                              : Active
       Applications Clocks Setting       : Not Active
       SW Power Cap                      : Not Active
       HW Slowdown                       : Not Active
           HW Thermal Slowdown           : Not Active
           HW Power Brake Slowdown       : Not Active
       Sync Boost                        : Not Active
       SW Thermal Slowdown               : Not Active
       Display Clock Setting             : Not Active
   FB Memory Usage
       Total                             : 4096 MiB
       Used                              : 1337 MiB
       Free                              : 2759 MiB
   BAR1 Memory Usage
       Total                             : 256 MiB
       Used                              : 2 MiB
       Free                              : 254 MiB
   Compute Mode                          : Default
   Utilization
       Gpu                               : N/A
       Memory                            : N/A
       Encoder                           : 0 %
       Decoder                           : 0 %
   Encoder Stats
       Active Sessions                   : 0
       Average FPS                       : 0
       Average Latency                   : 0
   FBC Stats
       Active Sessions                   : 0
       Average FPS                       : 0
       Average Latency                   : 0
   Ecc Mode
       Current                           : N/A
       Pending                           : N/A
   ECC Errors
       Volatile
           SRAM Correctable              : N/A
           SRAM Uncorrectable            : N/A
           DRAM Correctable              : N/A
           DRAM Uncorrectable            : N/A
       Aggregate
           SRAM Correctable              : N/A
           SRAM Uncorrectable            : N/A
           DRAM Correctable              : N/A
           DRAM Uncorrectable            : N/A
   Retired Pages
       Single Bit ECC                    : N/A
       Double Bit ECC                    : N/A
       Pending Page Blacklist            : N/A
   Remapped Rows                         : N/A
   Temperature
       GPU Current Temp                  : 40 C
       GPU Shutdown Temp                 : 102 C
       GPU Slowdown Temp                 : 97 C
       GPU Max Operating Temp            : 75 C
       GPU Target Temperature            : N/A
       Memory Current Temp               : N/A
       Memory Max Operating Temp         : N/A
   Power Readings
       Power Management                  : N/A
       Power Draw                        : 3.99 W
       Power Limit                       : N/A
       Default Power Limit               : N/A
       Enforced Power Limit              : N/A
       Min Power Limit                   : N/A
       Max Power Limit                   : N/A
   Clocks
       Graphics                          : 77 MHz
       SM                                : 77 MHz
       Memory                            : 197 MHz
       Video                             : 540 MHz
   Applications Clocks
       Graphics                          : N/A
       Memory                            : N/A
   Default Applications Clocks
       Graphics                          : N/A
       Memory                            : N/A
   Max Clocks
       Graphics                          : 2100 MHz
       SM                                : 2100 MHz
       Memory                            : 5001 MHz
       Video                             : 1950 MHz
   Max Customer Boost Clocks
       Graphics                          : N/A
   Clock Policy
       Auto Boost                        : N/A
       Auto Boost Default                : N/A
   Voltage
       Graphics                          : N/A
   Processes                             : None
  • [ ] Docker version from docker version
Client:
 Version:           20.10.7
 API version:       1.41
 Go version:        go1.13.8
 Git commit:        20.10.7-0ubuntu5~20.04.2
 Built:             Mon Nov  1 00:34:17 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.12
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.12
  Git commit:       459d0df
  Built:            Mon Dec 13 11:43:56 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.12
  GitCommit:        7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc:
  Version:          1.0.2
  GitCommit:        v1.0.2-0-g52b36a2
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
  • [ ] NVIDIA packages version from dpkg -l '*nvidia*' or rpm -qa '*nvidia*'
$ dpkg -l '*nvidia*'
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                             Version      Architecture Description
+++-================================-============-============-=====================================================
un  libgldispatch0-nvidia            <none>       <none>       (no description available)
ii  libnvidia-container-tools        1.8.0~rc.2-1 amd64        NVIDIA container runtime library (command-line tools)
ii  libnvidia-container1:amd64       1.8.0~rc.2-1 amd64        NVIDIA container runtime library
un  nvidia-common                    <none>       <none>       (no description available)
un  nvidia-container-runtime         <none>       <none>       (no description available)
un  nvidia-container-runtime-hook    <none>       <none>       (no description available)
ii  nvidia-container-toolkit         1.8.0~rc.2-1 amd64        NVIDIA container runtime hook
un  nvidia-docker                    <none>       <none>       (no description available)
ii  nvidia-docker2                   2.8.0-1      all          nvidia-docker CLI wrapper
un  nvidia-legacy-304xx-vdpau-driver <none>       <none>       (no description available)
un  nvidia-legacy-340xx-vdpau-driver <none>       <none>       (no description available)
un  nvidia-libopencl1-dev            <none>       <none>       (no description available)
un  nvidia-prime                     <none>       <none>       (no description available)
un  nvidia-vdpau-driver              <none>       <none>       (no description available)
  • [ ] NVIDIA container library version from nvidia-container-cli -V
$ nvidia-container-cli -V
cli-version: 1.8.0~rc.2
lib-version: 1.8.0~rc.2
build date: 2022-01-28T10:54+00:00
build revision: d48f9b0d505fca0aff7c88cee790f9c56aa1b851
build compiler: x86_64-linux-gnu-gcc-7 7.5.0
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fplan9-extensions -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections
  • [ ] NVIDIA container library logs (see troubleshooting)
  • [ ] Docker command, image and tag used

boyang9602 avatar Feb 03 '22 18:02 boyang9602

It's a strange bug: the GPU is actually available despite the error message, and it's fixed in later images (don't mind the nvidia-smi and driver versions; it's the same with 510.06):

➜ docker run --gpus all --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/tensorflow:22.01-tf2-py3 nvidia-smi

================
== TensorFlow ==
================

NVIDIA Release 22.01-tf2 (build 31081301)
TensorFlow Version 2.7.0

Container image Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Copyright 2017-2022 The TensorFlow Authors.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

NOTE: MOFED driver for multi-node communication was not detected.
      Multi-node communication performance may be reduced.

Fri Feb  4 11:22:39 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.39.01    Driver Version: 511.23       CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   56C    P8     4W /  N/A |    312MiB /  6144MiB |     15%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

compared to 20.03:

➜ docker run --gpus all --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/tensorflow:20.03-tf2-py3 nvidia-smi


================
== TensorFlow ==
================

NVIDIA Release 20.03-tf2 (build 11026100)
TensorFlow Version 2.1.0

Container image Copyright (c) 2019, NVIDIA CORPORATION.  All rights reserved.
Copyright 2017-2019 The TensorFlow Authors.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION.  All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying project or file.

WARNING: The NVIDIA Driver was not detected.  GPU functionality will not be available.
   Use 'nvidia-docker run' to start this container; see
   https://github.com/NVIDIA/nvidia-docker/wiki/nvidia-docker .

NOTE: MOFED driver for multi-node communication was not detected.
      Multi-node communication performance may be reduced.

Fri Feb  4 11:23:46 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.39.01    Driver Version: 511.23       CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   53C    P8     3W /  N/A |    316MiB /  6144MiB |      4%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
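One way to confirm that the 20.03 warning is purely cosmetic is to ask TensorFlow itself whether it sees the device. A sketch (tf.config.list_physical_devices is available from TF 2.1, which is what this image ships):

$ docker run --gpus all --rm nvcr.io/nvidia/tensorflow:20.03-tf2-py3 \
    python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
# If the GPU really is usable, this prints something like:
# [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]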

Btw, try updating your WSL kernel; 4.19 is pretty old. A sketch of the update, run from the Windows side (wsl --update needs a reasonably recent Windows 10/11 build; on older builds the kernel ships via Windows Update instead):
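
> wsl --update      # from PowerShell/cmd on Windows, not inside the distro
> wsl --shutdown    # restart the WSL VM so the new kernel takes effect
$ uname -r          # back inside the distro; should now report 5.10+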

PQLLUX avatar Feb 04 '22 11:02 PQLLUX

I have the same issue with exactly the same setup, except I'm on the newest kernel version. I tried modprobe nvidia and got: modprobe: FATAL: Module nvidia not found in directory /lib/modules/5.10.60.1-microsoft-standard-WSL2. The GPU is detected and theoretically runs in the container, but it only reserves GPU memory; utilization stays at 0%, which means my CPU performs the calculations faster. Has anyone found a solution yet? To separate "memory reserved" from "actually computing", a minimal in-container compute check is sketched below.
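(A sketch using TensorFlow; the image tag and matrix sizes are arbitrary, and it helps to watch nvidia-smi in another terminal while it runs:)

$ docker run --gpus all --rm nvcr.io/nvidia/tensorflow:22.01-tf2-py3 python -c "
import tensorflow as tf
with tf.device('/GPU:0'):                 # errors out if no usable GPU is found
    a = tf.random.normal([4096, 4096])
    b = tf.random.normal([4096, 4096])
    c = tf.matmul(a, b)                   # should push GPU utilization well above 0%
print(float(tf.reduce_mean(c)))
"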

Wojak27 avatar Feb 07 '22 22:02 Wojak27

I have exactly the same issue on WSL 2.

florian6973 avatar Mar 31 '22 20:03 florian6973

I've solved my issue by using the newest NVIDIA container image. For some reason the GPU is fully utilized now.

Wojak27 avatar Mar 31 '22 22:03 Wojak27