"Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]" or "Error response from daemon: failed to create shim task:"

Open revanthsenthil opened this issue 2 years ago • 3 comments

1. Issues

Issue 1:

[+] Running 3/0
 ⠿ Container gcs      Created                                                                   0.0s
 ⠿ Container onboard  Created                                                                   0.0s
 ⠿ Container px4      Created                                                                   0.0s
Attaching to gcs, onboard, px4, sim
Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown

The error above appears when I run docker compose up on the Docker containers pulled from this repository: https://github.com/jgoppert/auav_f22
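A quick way to see whether the dynamic linker can resolve libnvidia-ml.so.1 in a given environment is to try loading it. The sketch below uses Python's ctypes for that; the helper name can_load is made up for illustration. It only tests the environment it runs in, which may differ from the one where the container hook runs (for example, if the Docker daemon runs inside a VM).

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic linker can load the named shared library."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        # Raised when the library cannot be found or cannot be opened.
        return False


if __name__ == "__main__":
    print("libnvidia-ml.so.1 loadable:", can_load("libnvidia-ml.so.1"))
```

If this prints False on the machine where the daemon runs, the NVIDIA user-space driver libraries are probably missing there or not in the linker cache.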

Issue 2:

When the error above does not appear, this one appears instead:

Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]

This also comes from running the same command, docker compose up, on the same containers.
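As far as I understand, Docker only offers the "nvidia" device driver for GPU requests when it can find the NVIDIA container hook binary on the daemon's PATH at startup, so this second error often points at a missing or unreachable toolkit installation rather than at the GPU driver itself. That is an assumption about Docker's behaviour, not something confirmed in this thread. A minimal Python sketch that checks which of the relevant executables are visible on the current PATH (the helper name missing_executables is hypothetical):

```python
import shutil
from typing import Iterable, List

# nvidia-container-cli appears in this report; the hook binary name is an
# assumption about what the Docker daemon looks for.
CANDIDATES = ["nvidia-container-cli", "nvidia-container-runtime-hook"]


def missing_executables(names: Iterable[str]) -> List[str]:
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_executables(CANDIDATES)
    print("missing:", missing if missing else "none")
```

The check reflects the PATH of the shell it runs in, so it is only a hint about what the daemon itself can see.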

2. Steps to reproduce the issue

I followed the instructions in the repo linked above to set up the Docker containers. One notable difference: I used aptitude instead of apt to make sure the dependencies were installed as needed, because I previously had to use Synaptic to find the dependencies that had to be installed or removed for the required NVIDIA drivers.

I am running Ubuntu 22.04 on an x86_64 system (as indicated below), so the :i386 packages should technically not be installed, but they do exist.

3. Information to attach

  • [ ] Some nvidia-container information: nvidia-container-cli -k -d /dev/tty info
-- WARNING, the following logs are for debugging purposes only --

I0907 19:23:08.499383 476595 nvc.c:376] initializing library context (version=1.10.0, build=395fd41701117121f1fd04ada01e1d7e006a37ae)
I0907 19:23:08.499523 476595 nvc.c:350] using root /
I0907 19:23:08.499547 476595 nvc.c:351] using ldcache /etc/ld.so.cache
I0907 19:23:08.499572 476595 nvc.c:352] using unprivileged user 1000:1000
I0907 19:23:08.499634 476595 nvc.c:393] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0907 19:23:08.500224 476595 nvc.c:395] dxcore initialization failed, continuing assuming a non-WSL environment
W0907 19:23:12.926908 476604 nvc.c:273] failed to set inheritable capabilities
W0907 19:23:12.926942 476604 nvc.c:274] skipping kernel modules load due to failure
I0907 19:23:12.927351 476605 rpc.c:71] starting driver rpc service
I0907 19:23:12.933072 476606 rpc.c:71] starting nvcgo rpc service
I0907 19:23:12.933754 476595 nvc_info.c:766] requesting driver information with ''
I0907 19:23:12.935676 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvoptix.so.515.65.01
I0907 19:23:12.935798 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.515.65.01
I0907 19:23:12.935876 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.515.65.01
I0907 19:23:12.935955 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.515.65.01
I0907 19:23:12.936044 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.515.65.01
I0907 19:23:12.936106 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.515.65.01
I0907 19:23:12.936160 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.515.65.01
I0907 19:23:12.936218 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.515.65.01
I0907 19:23:12.936294 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.515.65.01
I0907 19:23:12.936347 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.515.65.01
I0907 19:23:12.936417 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.515.65.01
I0907 19:23:12.936526 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.515.65.01
I0907 19:23:12.936791 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libcuda.so.515.65.01
I0907 19:23:12.937040 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.515.65.01
I0907 19:23:12.937135 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.515.65.01
I0907 19:23:12.937257 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.515.65.01
I0907 19:23:12.937322 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.515.65.01
W0907 19:23:12.937418 476595 nvc_info.c:399] missing library libnvidia-cfg.so
W0907 19:23:12.937427 476595 nvc_info.c:399] missing library libnvidia-nscq.so
W0907 19:23:12.937432 476595 nvc_info.c:399] missing library libcudadebugger.so
W0907 19:23:12.937437 476595 nvc_info.c:399] missing library libnvidia-fatbinaryloader.so
W0907 19:23:12.937444 476595 nvc_info.c:399] missing library libnvidia-allocator.so
W0907 19:23:12.937451 476595 nvc_info.c:399] missing library libnvidia-pkcs11.so
W0907 19:23:12.937459 476595 nvc_info.c:399] missing library libvdpau_nvidia.so
W0907 19:23:12.937466 476595 nvc_info.c:399] missing library libnvidia-encode.so
W0907 19:23:12.937473 476595 nvc_info.c:399] missing library libnvidia-opticalflow.so
W0907 19:23:12.937480 476595 nvc_info.c:399] missing library libnvcuvid.so
W0907 19:23:12.937487 476595 nvc_info.c:399] missing library libnvidia-fbc.so
W0907 19:23:12.937494 476595 nvc_info.c:399] missing library libnvidia-ifr.so
W0907 19:23:12.937501 476595 nvc_info.c:399] missing library libnvidia-cbl.so
W0907 19:23:12.937520 476595 nvc_info.c:403] missing compat32 library libnvidia-ml.so
W0907 19:23:12.937546 476595 nvc_info.c:403] missing compat32 library libnvidia-cfg.so
W0907 19:23:12.937552 476595 nvc_info.c:403] missing compat32 library libnvidia-nscq.so
W0907 19:23:12.937558 476595 nvc_info.c:403] missing compat32 library libcuda.so
W0907 19:23:12.937564 476595 nvc_info.c:403] missing compat32 library libcudadebugger.so
W0907 19:23:12.937572 476595 nvc_info.c:403] missing compat32 library libnvidia-opencl.so
W0907 19:23:12.937579 476595 nvc_info.c:403] missing compat32 library libnvidia-ptxjitcompiler.so
W0907 19:23:12.937585 476595 nvc_info.c:403] missing compat32 library libnvidia-fatbinaryloader.so
W0907 19:23:12.937591 476595 nvc_info.c:403] missing compat32 library libnvidia-allocator.so
W0907 19:23:12.937599 476595 nvc_info.c:403] missing compat32 library libnvidia-compiler.so
W0907 19:23:12.937605 476595 nvc_info.c:403] missing compat32 library libnvidia-pkcs11.so
W0907 19:23:12.937612 476595 nvc_info.c:403] missing compat32 library libnvidia-ngx.so
W0907 19:23:12.937619 476595 nvc_info.c:403] missing compat32 library libvdpau_nvidia.so
W0907 19:23:12.937626 476595 nvc_info.c:403] missing compat32 library libnvidia-encode.so
W0907 19:23:12.937633 476595 nvc_info.c:403] missing compat32 library libnvidia-opticalflow.so
W0907 19:23:12.937640 476595 nvc_info.c:403] missing compat32 library libnvcuvid.so
W0907 19:23:12.937647 476595 nvc_info.c:403] missing compat32 library libnvidia-eglcore.so
W0907 19:23:12.937654 476595 nvc_info.c:403] missing compat32 library libnvidia-glcore.so
W0907 19:23:12.937661 476595 nvc_info.c:403] missing compat32 library libnvidia-tls.so
W0907 19:23:12.937668 476595 nvc_info.c:403] missing compat32 library libnvidia-glsi.so
W0907 19:23:12.937676 476595 nvc_info.c:403] missing compat32 library libnvidia-fbc.so
W0907 19:23:12.937683 476595 nvc_info.c:403] missing compat32 library libnvidia-ifr.so
W0907 19:23:12.937690 476595 nvc_info.c:403] missing compat32 library libnvidia-rtcore.so
W0907 19:23:12.937697 476595 nvc_info.c:403] missing compat32 library libnvoptix.so
W0907 19:23:12.937704 476595 nvc_info.c:403] missing compat32 library libGLX_nvidia.so
W0907 19:23:12.937711 476595 nvc_info.c:403] missing compat32 library libEGL_nvidia.so
W0907 19:23:12.937718 476595 nvc_info.c:403] missing compat32 library libGLESv2_nvidia.so
W0907 19:23:12.937725 476595 nvc_info.c:403] missing compat32 library libGLESv1_CM_nvidia.so
W0907 19:23:12.937732 476595 nvc_info.c:403] missing compat32 library libnvidia-glvkspirv.so
W0907 19:23:12.937739 476595 nvc_info.c:403] missing compat32 library libnvidia-cbl.so
I0907 19:23:12.938079 476595 nvc_info.c:299] selecting /usr/bin/nvidia-smi
I0907 19:23:12.938120 476595 nvc_info.c:299] selecting /usr/bin/nvidia-debugdump
W0907 19:23:12.938590 476595 nvc_info.c:425] missing binary nvidia-persistenced
W0907 19:23:12.938596 476595 nvc_info.c:425] missing binary nv-fabricmanager
W0907 19:23:12.938600 476595 nvc_info.c:425] missing binary nvidia-cuda-mps-control
W0907 19:23:12.938604 476595 nvc_info.c:425] missing binary nvidia-cuda-mps-server
W0907 19:23:12.938653 476595 nvc_info.c:349] missing firmware path /lib/firmware/nvidia/515.65.01/gsp.bin
I0907 19:23:12.938696 476595 nvc_info.c:529] listing device /dev/nvidiactl
I0907 19:23:12.938700 476595 nvc_info.c:529] listing device /dev/nvidia-uvm
I0907 19:23:12.938705 476595 nvc_info.c:529] listing device /dev/nvidia-uvm-tools
I0907 19:23:12.938709 476595 nvc_info.c:529] listing device /dev/nvidia-modeset
W0907 19:23:12.938768 476595 nvc_info.c:349] missing ipc path /var/run/nvidia-persistenced/socket
W0907 19:23:12.938811 476595 nvc_info.c:349] missing ipc path /var/run/nvidia-fabricmanager/socket
W0907 19:23:12.938829 476595 nvc_info.c:349] missing ipc path /tmp/nvidia-mps
I0907 19:23:12.938834 476595 nvc_info.c:822] requesting device information with ''
I0907 19:23:12.947131 476595 nvc_info.c:713] listing device /dev/nvidia0 (GPU-21e3065c-8a1a-6cf7-b0fd-41d6c51f726e at 00000000:01:00.0)
NVRM version:   515.65.01
CUDA version:   11.7

Device Index:   0
Device Minor:   0
Model:          NVIDIA GeForce GTX 1650
Brand:          GeForce
GPU UUID:       GPU-21e3065c-8a1a-6cf7-b0fd-41d6c51f726e
Bus Location:   00000000:01:00.0
Architecture:   7.5
I0907 19:23:12.947333 476595 nvc.c:434] shutting down library context
I0907 19:23:12.947490 476606 rpc.c:95] terminating nvcgo rpc service
I0907 19:23:12.949168 476595 rpc.c:135] nvcgo rpc service terminated successfully
I0907 19:23:12.954386 476605 rpc.c:95] terminating driver rpc service
I0907 19:23:12.954810 476595 rpc.c:135] driver rpc service terminated successfully
  • [ ] Kernel version from uname -a
Linux revanth-XPS-15-7590 5.15.0-47-generic #51-Ubuntu SMP Thu Aug 11 07:51:15 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  • [ ] Driver information from nvidia-smi -a
==============NVSMI LOG==============

Timestamp                                 : Wed Sep  7 15:26:42 2022
Driver Version                            : 515.65.01
CUDA Version                              : 11.7

Attached GPUs                             : 1
GPU 00000000:01:00.0
    Product Name                          : NVIDIA GeForce GTX 1650
    Product Brand                         : GeForce
    Product Architecture                  : Turing
    Display Mode                          : Disabled
    Display Active                        : Disabled
    Persistence Mode                      : Disabled
    MIG Mode
        Current                           : N/A
        Pending                           : N/A
    Accounting Mode                       : Disabled
    Accounting Mode Buffer Size           : 4000
    Driver Model
        Current                           : N/A
        Pending                           : N/A
    Serial Number                         : N/A
    GPU UUID                              : GPU-21e3065c-8a1a-6cf7-b0fd-41d6c51f726e
    Minor Number                          : 0
    VBIOS Version                         : 90.17.1C.40.4B
    MultiGPU Board                        : No
    Board ID                              : 0x100
    GPU Part Number                       : N/A
    Module ID                             : 0
    Inforom Version
        Image Version                     : G001.0000.02.04
        OEM Object                        : 1.1
        ECC Object                        : N/A
        Power Management Object           : N/A
    GPU Operation Mode
        Current                           : N/A
        Pending                           : N/A
    GSP Firmware Version                  : N/A
    GPU Virtualization Mode
        Virtualization Mode               : None
        Host VGPU Mode                    : N/A
    IBMNPU
        Relaxed Ordering Mode             : N/A
    PCI
        Bus                               : 0x01
        Device                            : 0x00
        Domain                            : 0x0000
        Device Id                         : 0x1F9110DE
        Bus Id                            : 00000000:01:00.0
        Sub System Id                     : 0x8601103C
        GPU Link Info
            PCIe Generation
                Max                       : 3
                Current                   : 3
            Link Width
                Max                       : 16x
                Current                   : 16x
        Bridge Chip
            Type                          : N/A
            Firmware                      : N/A
        Replays Since Reset               : 0
        Replay Number Rollovers           : 0
        Tx Throughput                     : 0 KB/s
        Rx Throughput                     : 0 KB/s
    Fan Speed                             : N/A
    Performance State                     : P3
    Clocks Throttle Reasons
        Idle                              : Not Active
        Applications Clocks Setting       : Not Active
        SW Power Cap                      : Active
        HW Slowdown                       : Not Active
            HW Thermal Slowdown           : Not Active
            HW Power Brake Slowdown       : Not Active
        Sync Boost                        : Not Active
        SW Thermal Slowdown               : Active
        Display Clock Setting             : Not Active
    FB Memory Usage
        Total                             : 4096 MiB
        Reserved                          : 181 MiB
        Used                              : 10 MiB
        Free                              : 3904 MiB
    BAR1 Memory Usage
        Total                             : 256 MiB
        Used                              : 3 MiB
        Free                              : 253 MiB
    Compute Mode                          : Default
    Utilization
        Gpu                               : 0 %
        Memory                            : 0 %
        Encoder                           : 0 %
        Decoder                           : 0 %
    Encoder Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    FBC Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    Ecc Mode
        Current                           : N/A
        Pending                           : N/A
    ECC Errors
        Volatile
            SRAM Correctable              : N/A
            SRAM Uncorrectable            : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
        Aggregate
            SRAM Correctable              : N/A
            SRAM Uncorrectable            : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
    Retired Pages
        Single Bit ECC                    : N/A
        Double Bit ECC                    : N/A
        Pending Page Blacklist            : N/A
    Remapped Rows                         : N/A
    Temperature
        GPU Current Temp                  : 43 C
        GPU Shutdown Temp                 : 102 C
        GPU Slowdown Temp                 : 97 C
        GPU Max Operating Temp            : 75 C
        GPU Target Temperature            : N/A
        Memory Current Temp               : N/A
        Memory Max Operating Temp         : N/A
    Power Readings
        Power Management                  : N/A
        Power Draw                        : 11.96 W
        Power Limit                       : N/A
        Default Power Limit               : N/A
        Enforced Power Limit              : N/A
        Min Power Limit                   : N/A
        Max Power Limit                   : N/A
    Clocks
        Graphics                          : 1395 MHz
        SM                                : 1395 MHz
        Memory                            : 3500 MHz
        Video                             : 1290 MHz
    Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Default Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Max Clocks
        Graphics                          : 2100 MHz
        SM                                : 2100 MHz
        Memory                            : 4001 MHz
        Video                             : 1950 MHz
    Max Customer Boost Clocks
        Graphics                          : N/A
    Clock Policy
        Auto Boost                        : N/A
        Auto Boost Default                : N/A
    Voltage
        Graphics                          : N/A
    Processes
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 3541
            Type                          : G
            Name                          : /usr/lib/xorg/Xorg
            Used GPU Memory               : 4 MiB
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 468410
            Type                          : C+G
            Name                          : /opt/google/chrome/chrome --type=gpu-process --enable-crashpad --crashpad-handler-pid=5316 --enable-crash-reporter=e3964464-0402-4a0b-9245-6fd60cb8f256, --change-stack-guard-on-fork=enable --gpu-preferences=WAAAAAAAAAAgAAAIAAAAAAAAAAAAAAAAAABgAAEAAAA4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAAAAAAAAAAIAAAAAAAAAABAAAAAAAAAAgAAAAAAAAACAAAAAAAAAAIAAAAAAAAAA== --shared-files --field-trial-handle=0,i,16466788088194899339,13002346894531801217,131072
            Used GPU Memory               : 4 MiB
  • [ ] Docker version from docker version
20.10.17
Client: Docker Engine - Community
 Cloud integration: v1.0.28
 Version:           20.10.17
 API version:       1.41
 Go version:        go1.17.11
 Git commit:        100c701
 Built:             Mon Jun  6 23:02:46 2022
 OS/Arch:           linux/amd64
 Context:           desktop-linux
 Experimental:      true

Server: Docker Desktop 4.11.0 (83626)
 Engine:
  Version:          20.10.17
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.17.11
  Git commit:       a89b842
  Built:            Mon Jun  6 23:01:23 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.6
  GitCommit:        ****************************
 runc:
  Version:          1.1.2
  GitCommit:        v1.1.2-0-ga916309
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

  • [ ] NVIDIA packages version from dpkg -l '*nvidia*' or rpm -qa '*nvidia*'
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                Version                    Architecture Description
+++-===================================-==========================-============-====================>
un  libgldispatch0-nvidia               <none>                     <none>       (no description avai>
un  libnvidia-common                    <none>                     <none>       (no description avai>
ii  libnvidia-common-515-server         515.65.01-0ubuntu0.22.04.1 all          Shared files used by>
un  libnvidia-compute                   <none>                     <none>       (no description avai>
ii  libnvidia-compute-515:amd64         515.65.01-0ubuntu0.22.04.1 amd64        NVIDIA libcompute pa>
ii  libnvidia-container-tools           1.10.0-1                   amd64        NVIDIA container run>
ii  libnvidia-container1:amd64          1.10.0-1                   amd64        NVIDIA container run>
ii  libnvidia-egl-wayland1:amd64        1:1.1.9-1.1                amd64        Wayland EGL External>
un  libnvidia-encode1                   <none>                     <none>       (no description avai>
un  libnvidia-gl                        <none>                     <none>       (no description avai>
un  libnvidia-gl-390                    <none>                     <none>       (no description avai>
un  libnvidia-gl-410                    <none>                     <none>       (no description avai>
ii  libnvidia-gl-515-server:amd64       515.65.01-0ubuntu0.22.04.1 amd64        NVIDIA OpenGL/GLX/EG>
un  libnvidia-legacy-390xx-egl-wayland1 <none>                     <none>       (no description avai>
un  libnvidia-ml1                       <none>                     <none>       (no description avai>
un  nvidia-384                          <none>                     <none>       (no description avai>
un  nvidia-390                          <none>                     <none>       (no description avai>
un  nvidia-common                       <none>                     <none>       (no description avai>
un  nvidia-compute-utils                <none>                     <none>       (no description avai>
rc  nvidia-compute-utils-515            515.65.01-0ubuntu0.22.04.1 amd64        NVIDIA compute utili>
un  nvidia-container-runtime            <none>                     <none>       (no description avai>
un  nvidia-container-runtime-hook       <none>                     <none>       (no description avai>
ii  nvidia-container-toolkit            1.10.0-1                   amd64        NVIDIA container run>
rc  nvidia-dkms-515                     515.65.01-0ubuntu0.22.04.1 amd64        NVIDIA DKMS package
un  nvidia-dkms-kernel                  <none>                     <none>       (no description avai>
un  nvidia-docker                       <none>                     <none>       (no description avai>
ii  nvidia-docker2                      2.11.0-1                   all          nvidia-docker CLI wr>
un  nvidia-driver-515                   <none>                     <none>       (no description avai>
un  nvidia-egl-wayland-common           <none>                     <none>       (no description avai>
un  nvidia-kernel-common                <none>                     <none>       (no description avai>
rc  nvidia-kernel-common-515            515.65.01-0ubuntu0.22.04.1 amd64        Shared files used wi>
un  nvidia-kernel-source-515            <none>                     <none>       (no description avai>
un  nvidia-libopencl1-dev               <none>                     <none>       (no description avai>
un  nvidia-opencl-icd                   <none>                     <none>       (no description avai>
un  nvidia-persistenced                 <none>                     <none>       (no description avai>
rc  nvidia-prime                        0.8.17.1                   all          Tools to enable NVID>
ii  nvidia-settings                     510.47.03-0ubuntu1         amd64        Tool for configuring>
un  nvidia-settings-binary              <none>                     <none>       (no description avai>
un  nvidia-smi                          <none>                     <none>       (no description avai>
un  nvidia-utils                        <none>                     <none>       (no description avai>
ii  nvidia-utils-515                    515.65.01-0ubuntu0.22.04.1 amd64        NVIDIA driver suppor>

  • [ ] NVIDIA container library version from nvidia-container-cli -V
cli-version: 1.10.0
lib-version: 1.10.0
build date: 2022-06-13T10:39+00:00
build revision: 395fd41701117121f1fd04ada01e1d7e006a37ae
build compiler: x86_64-linux-gnu-gcc-7 7.5.0
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fplan9-extensions -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections

  • [ ] NVIDIA container library logs

The output shown under Issues above was produced with the debug=... line uncommented. With the debug=... line commented out, the output was as follows:

[+] Running 3/0
 ⠿ Container px4      Created                                                                   0.0s
 ⠿ Container gcs      Created                                                                   0.0s
 ⠿ Container onboard  Created                                                                   0.0s
Attaching to gcs, onboard, px4, sim
px4      | Unable to init server: Could not connect: Connection refused
px4      | Unable to init server: Could not connect: Connection refused
px4      | You need to run terminator in an X environment. Make sure $DISPLAY is properly set
Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown
  • [ ] Docker command, image and tag used

As linked previously, 4 Docker containers from https://github.com/jgoppert/auav_f22
Command causing issue: docker compose up
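The dpkg -l listing above is easier to read once the rows are grouped by their two-letter status (per the listing's own header: ii is installed, rc is removed with config files remaining, un is not installed). Below is a small Python sketch that does this grouping; the function name packages_in_state and the SAMPLE excerpt are illustrative, and the parser only handles plain two-letter states.

```python
import re
from typing import List

# Matches dpkg -l package rows: two status letters, then name and version.
ROW = re.compile(r"^([uihrp][ncHUFWti])\s+(\S+)\s+(\S+)")


def packages_in_state(dpkg_output: str, state: str) -> List[str]:
    """Return package names whose two-letter dpkg status equals `state`."""
    names = []
    for line in dpkg_output.splitlines():
        match = ROW.match(line)
        if match and match.group(1) == state:
            names.append(match.group(2))
    return names


# A few rows copied from the listing in this report.
SAMPLE = """\
||/ Name                                Version                    Architecture Description
ii  nvidia-container-toolkit            1.10.0-1                   amd64        NVIDIA container run>
rc  nvidia-dkms-515                     515.65.01-0ubuntu0.22.04.1 amd64        NVIDIA DKMS package
un  nvidia-driver-515                   <none>                     <none>       (no description avai>
rc  nvidia-kernel-common-515            515.65.01-0ubuntu0.22.04.1 amd64        Shared files used wi>
"""

if __name__ == "__main__":
    print("removed, config left:", packages_in_state(SAMPLE, "rc"))
    print("not installed:", packages_in_state(SAMPLE, "un"))
```

In the listing above, nvidia-dkms-515 and nvidia-kernel-common-515 are in state rc and nvidia-driver-515 is un, which might be worth a look, although nothing in this thread establishes that as the cause.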

revanthsenthil avatar Sep 07 '22 19:09 revanthsenthil

Having the same issue on Leap 15.3.
Kernel: 5.19.8-lp153.2.g0330383-default

The error is returned after updating:
  • libnvidia-container1 1.10.0-1 -> 1.11.0-1
  • libnvidia-container-tools 1.10.0-1 -> 1.11.0-1
  • nvidia-container-toolkit 1.10.0-1 -> 1.11.0-1

nvidia-docker2 version: 2.11.0-1
Docker version 20.10.17-ce, build a89b84221c85

riddlecp avatar Sep 16 '22 19:09 riddlecp

@riddlecp there seems to be an issue with the v1.11.0 package that means that upgrading from 1.10.0 to 1.11.0 may not work as expected. Could you try to remove nvidia-container-toolkit entirely and reinstall the v1.11.0 version?

See https://github.com/NVIDIA/nvidia-docker/issues/1682#issuecomment-1250952249 for more context.

elezar avatar Sep 19 '22 14:09 elezar

Thanks Elezar, I saw that thread this morning and was attempting the fix when you replied. Removing the NVIDIA container toolkit and reinstalling it fixed the issue. I did notice that the removal also tries to uninstall nvidia-docker2, so I reinstalled that as well.

riddlecp avatar Sep 19 '22 15:09 riddlecp
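
For anyone landing here with the same symptom after the 1.10.0 to 1.11.0 upgrade, the remove-and-reinstall sequence described above could look roughly like the following. The thread does not show the exact commands, so the apt-get/zypper invocations and the final Docker restart are assumptions; the sketch only builds and prints the commands rather than running them.

```python
import shlex
from typing import List

# Package names are taken from the thread; nvidia-docker2 is reinstalled too
# because removing the toolkit was reported to uninstall it.
PACKAGES = ["nvidia-container-toolkit", "nvidia-docker2"]


def reinstall_commands(manager: str) -> List[List[str]]:
    """Build (but do not run) a remove-then-install command sequence."""
    if manager == "apt":
        tool = ["sudo", "apt-get"]
    elif manager == "zypper":
        tool = ["sudo", "zypper"]
    else:
        raise ValueError(f"unsupported package manager: {manager}")
    return [
        tool + ["remove", "nvidia-container-toolkit"],
        tool + ["install"] + PACKAGES,
        # Assumption: the daemon is managed by systemd and named "docker".
        ["sudo", "systemctl", "restart", "docker"],
    ]


if __name__ == "__main__":
    for cmd in reinstall_commands("apt"):
        print(shlex.join(cmd))
```

Printing first makes it easy to review the sequence and adapt it to the distribution before running anything by hand.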