CUDA/OpenCL on RTX5070 does not work.

Open cederom opened this issue 5 months ago • 8 comments

Hello world :-)

First of all, a BIG THANK YOU to @shkhln for this solution, which allows running Linux and Nvidia binary software on FreeBSD. It is a shame that Nvidia ignores FreeBSD and open source, but well over 15 years of asking has changed nothing.

I recently replaced my GTX1060 with an RTX5070, and CUDA/OpenCL stopped working. It worked fine on the GTX1060. The difference is that the RTX5070 needs GSP firmware in order to start Xorg. Maybe some new ioctl is required? I tried the 550.127.05 and 575.64.03 drivers (the logs below also show 570.153.02 and 570.169).

Here is the bug report on FreeBSD's bugzilla: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=287895

GTX1060:

# nv-sglrun nvidia-smi
shim init
Tue Feb  4 11:16:02 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.127.05             Driver Version: 550.127.05     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1060 6GB    Off |   00000000:01:00.0  On |                  N/A |
| 29%   50C    P0             26W /  120W |    1220MiB /   6144MiB |      4%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+


# nv-sglrun clpeak
shim init

Platform: NVIDIA CUDA
  Device: NVIDIA GeForce GTX 1060 6GB
    Driver version  : 550.127.05 (FreeBSD)
    Compute units   : 10
    Clock frequency : 1708 MHz

    Global memory bandwidth (GBPS)
      float   : 138.25
      float2  : 142.37
      float4  : 147.13
      float8  : 147.19
      float16 : 98.19

    Single-precision compute (GFLOPS)
      float   : 4108.05
      float2  : 4312.85
      float4  : 4283.72
      float8  : 4249.30
      float16 : 4226.22

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 140.94
      double2  : 140.18
      double4  : 139.88
      double8  : 138.68
      double16 : 139.25

    Integer compute (GIOPS)
      int   : 1441.66
      int2  : 1421.13
      int4  : 1431.67
      int8  : 1316.01
      int16 : 1299.19

    Integer compute Fast 24bit (GIOPS)
      int   : 1428.16
      int2  : 1394.27
      int4  : 1415.95
      int8  : 1411.34
      int16 : 1388.43

    Integer char (8bit) compute (GIOPS)
      char   : 3871.67
      char2  : 4115.16
      char4  : 4133.75
      char8  : 4086.24
      char16 : 4042.09

    Integer short (16bit) compute (GIOPS)
      short   : 3798.35
      short2  : 3925.07
      short4  : 4028.50
      short8  : 4101.96
      short16 : 4036.17

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 5.14
      enqueueReadBuffer               : 5.82
      enqueueWriteBuffer non-blocking : 4.99
      enqueueReadBuffer non-blocking  : 5.46
      enqueueMapBuffer(for read)      : 5.86
        memcpy from mapped ptr        : 4.16
      enqueueUnmap(after write)       : 5.93
        memcpy to mapped ptr          : 4.18

    Kernel launch latency : 7.88 us

RTX5070:

% uname -a
FreeBSD hexagon 14.2-RELEASE-p3 FreeBSD 14.2-RELEASE-p3 #0 releng/14.2-n269524-1eb03b059e56-dirty: Tue Jun 24 13:08:04 CEST 2025     root@hexagon:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64


% pkg info -x nvidia
linux-nvidia-libs-570.169
nvidia-driver-570.169.1402000
nvidia-drm-61-kmod-570.169.1402000_2
nvidia-drm-kmod-570.169
nvidia-settings-570.169
nvidia-xconfig-570.169


% nv-sglrun nvidia-smi
/usr/local/lib/libc6-shim/libc6.so: shim init
Thu Jun 26 16:13:06 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.169                Driver Version: 570.169        CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5070        Off |   00000000:02:00.0  On |                  N/A |
|  0%   38C    P0             31W /  250W |    1204MiB /  12227MiB |      4%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+


% nv-sglrun clpeak
/usr/local/lib/libc6-shim/libc6.so: shim init

Platform: NVIDIA CUDA
  Device: NVIDIA GeForce RTX 5070
    Driver version  : 570.169 (FreeBSD)
    Compute units   : 48
    Clock frequency : 2610 MHz

    Global memory bandwidth (GBPS)
zsh: segmentation fault (core dumped)  nv-sglrun clpeak


% nv-sglrun nvidia-smi
/usr/local/lib/libc6-shim/libc6.so: shim init
Mon Jun 23 15:37:37 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.153.02             Driver Version: 570.153.02     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5070        Off |   00000000:02:00.0  On |                  N/A |
|  0%   46C    P0             31W /  250W |    1500MiB /  12227MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+

% nv-sglrun clpeak
/usr/local/lib/libc6-shim/libc6.so: shim init

Platform: NVIDIA CUDA
  Device: NVIDIA GeForce RTX 5070
    Driver version  : 570.153.02 (FreeBSD)
    Compute units   : 48
    Clock frequency : 2610 MHz

    Global memory bandwidth (GBPS)
zsh: segmentation fault (core dumped)  nv-sglrun clpeak

% lldb -c clpeak.core
(lldb) target create --core "clpeak.core"
Core file '/XXX/clpeak.core' (x86_64) was loaded.
(lldb) bt
* thread #1, name = 'clpeak', stop reason = signal SIGSEGV
  * frame #0: 0x000000082765aa78
    frame #1: 0x0000000820b93ef0

Here is my latest attempt with x11/nvidia-driver-devel, made even before it was committed to the ports tree:

% nvidia-smi
Fri Jul  4 04:09:23 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.64.03              Driver Version: 575.64.03      CUDA Version: N/A      |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5070        Off |   00000000:02:00.0  On |                  N/A |
|  0%   37C    P0             28W /  250W |     849MiB /  12227MiB |     21%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            1910      G   /usr/local/bin/firefox                  288MiB |
|    0   N/A  N/A           82607      G   /usr/local/libexec/Xorg                 331MiB |
|    0   N/A  N/A           87624      G   /usr/local/bin/enlightenment            152MiB |
+-----------------------------------------------------------------------------------------+


hexagon% nv-sglrun nvidia-smi
/usr/local/lib/libc6-shim/libc6.so: shim init
Fri Jul  4 04:09:32 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.64.03              Driver Version: 575.64.03      CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5070        Off |   00000000:02:00.0  On |                  N/A |
|  0%   36C    P0             27W /  250W |     849MiB /  12227MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+


hexagon% nv-sglrun clpeak
/usr/local/lib/libc6-shim/libc6.so: shim init

Platform: NVIDIA CUDA
  Device: NVIDIA GeForce RTX 5070
    Driver version  : 575.64.03 (FreeBSD)
    Compute units   : 48
    Clock frequency : 2610 MHz

    Global memory bandwidth (GBPS)
zsh: segmentation fault  nv-sglrun clpeak


hexagon% pkg info -x nvidia
linux-nvidia-libs-devel-575.64.03
nvidia-driver-devel-575.64.03.1402000
nvidia-drm-61-kmod-devel-575.64.03.1402000_2
nvidia-drm-kmod-devel-575.64.03
nvidia-settings-570.169
nvidia-xconfig-570.169

This may be a clue:

% nv-sglrun nvidia-smi -q
/usr/local/lib/libc6-shim/libc6.so: shim init

==============NVSMI LOG==============

Timestamp                                 : Mon Jul  7 04:25:06 2025
Driver Version                            : 575.64.03
CUDA Version                              : 12.9

Attached GPUs                             : 1
GPU 00000000:02:00.0
    Product Name                          : NVIDIA GeForce RTX 5070
    Product Brand                         : GeForce
    Product Architecture                  : Blackwell
    Display Mode                          : Requested functionality has been deprecated
    Display Attached                      : Yes
    Display Active                        : Enabled
    Persistence Mode                      : Disabled
shim_ioctl_impl(-1, 0x27, _) is not implemented
0x825109578 <shim_ioctl+0x288> at /usr/local/lib/libc6-shim/libc6.so
0x829c501f4 <nvmlDeviceGetVgpuTypeCreatablePlacements+0xaa6c4> at /compat/linux/usr/lib64/libnvidia-ml.so.1
0x829c510ee <nvmlDeviceGetVgpuTypeCreatablePlacements+0xab5be> at /compat/linux/usr/lib64/libnvidia-ml.so.1
0x829c2172b <nvmlDeviceGetVgpuTypeCreatablePlacements+0x7bbfb> at /compat/linux/usr/lib64/libnvidia-ml.so.1
0x829b4be14 <nvmlInternalGetExportTable+0x1fce4> at /compat/linux/usr/lib64/libnvidia-ml.so.1
0x412735 <???> at /usr/local/bin/nvidia-smi
0x423a60 <???> at /usr/local/bin/nvidia-smi
0x404d82 <???> at /usr/local/bin/nvidia-smi
zsh: abort (core dumped)  nv-sglrun nvidia-smi -q
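
To make the failing call easier to search for, the `shim_ioctl_impl(...) is not implemented` line above can be pulled out of a log and the request number split into its Linux `_IOC` fields. This is only a sketch: the function names are hypothetical, the regular expression is inferred solely from the single line shown in this report, and the bit layout assumed is the generic Linux one used on x86-64 (8-bit number, 8-bit type, 14-bit size, 2-bit direction).

```python
import re

# Matches lines such as: shim_ioctl_impl(-1, 0x27, _) is not implemented
# The pattern is inferred from the single log line in this report (assumption).
UNIMPLEMENTED = re.compile(
    r"shim_ioctl_impl\((-?\d+),\s*(0x[0-9a-fA-F]+),\s*_\) is not implemented"
)


def find_unimplemented_ioctls(log_text):
    """Return (fd, request) pairs for every unimplemented-ioctl line in the log."""
    return [(int(fd), int(req, 16)) for fd, req in UNIMPLEMENTED.findall(log_text)]


def decode_linux_ioctl(request):
    """Split a request number using the generic Linux _IOC layout (assumed)."""
    return {
        "nr": request & 0xFF,             # bits 0-7: command number
        "type": (request >> 8) & 0xFF,    # bits 8-15: driver "magic" type
        "size": (request >> 16) & 0x3FFF, # bits 16-29: argument size
        "dir": (request >> 30) & 0x3,     # bits 30-31: transfer direction
    }


log = "shim_ioctl_impl(-1, 0x27, _) is not implemented"
for fd, request in find_unimplemented_ioctls(log):
    print(fd, hex(request), decode_linux_ioctl(request))
```

For the line in this report, the descriptor passed is -1 (not a valid open descriptor) and the request decodes to number 0x27 with type, size and direction all zero. That does not identify the call by itself, but it narrows down what to grep for in the shim and the driver headers.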

Below is the nvidia-smi output without the nv-sglrun wrapper:

% nvidia-smi
Mon Jul  7 04:32:51 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.64.03              Driver Version: 575.64.03      CUDA Version: N/A      |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5070        Off |   00000000:02:00.0  On |                  N/A |
|  0%   39C    P0             29W /  250W |    1731MiB /  12227MiB |     25%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A           16578      G   /usr/local/libexec/Xorg                 498MiB |
|    0   N/A  N/A           18826      G   /usr/local/bin/enlightenment            415MiB |
|    0   N/A  N/A           25940      G   /usr/local/bin/firefox                  343MiB |
|    0   N/A  N/A           43452      G   ...l/lib/virtualbox/VirtualBoxVM        379MiB |
+-----------------------------------------------------------------------------------------+


hexagon% nvidia-smi -q

==============NVSMI LOG==============

Timestamp                                 : Mon Jul  7 04:33:00 2025
Driver Version                            : 575.64.03
CUDA Version                              : Not Found

Attached GPUs                             : 1
GPU 00000000:02:00.0
    Product Name                          : NVIDIA GeForce RTX 5070
    Product Brand                         : GeForce
    Product Architecture                  : Blackwell
    Display Mode                          : Requested functionality has been deprecated
    Display Attached                      : Yes
    Display Active                        : Enabled
    Persistence Mode                      : Disabled
    Addressing Mode                       : N/A
    MIG Mode
        Current                           : N/A
        Pending                           : N/A
    Accounting Mode                       : Disabled
    Accounting Mode Buffer Size           : 4000
    Driver Model
        Current                           : N/A
        Pending                           : N/A
    Serial Number                         : 0
    GPU UUID                              : GPU-ce9d59d9-13aa-acd6-8ef9-58c4fe9408fc
    Minor Number                          : 0
    VBIOS Version                         : 98.05.28.00.65
    MultiGPU Board                        : No
    Board ID                              : 0x200
    Board Part Number                     : N/A
    GPU Part Number                       : 2F04-300-A1
    FRU Part Number                       : N/A
    Platform Info
        Chassis Serial Number             :
        Slot Number                       : 0
        Tray Index                        : 0
        Host ID                           : 1
        Peer Type                         : Direct Connected
        Module Id                         : 1
        GPU Fabric GUID                   : 0x0000000000000000
    Inforom Version
        Image Version                     : G005.0000.98.01
        OEM Object                        : 2.1
        ECC Object                        : N/A
        Power Management Object           : N/A
    Inforom BBX Object Flush
        Latest Timestamp                  : N/A
        Latest Duration                   : N/A
    GPU Operation Mode
        Current                           : N/A
        Pending                           : N/A
    GPU C2C Mode                          : Disabled
    GPU Virtualization Mode
        Virtualization Mode               : None
        Host VGPU Mode                    : N/A
        vGPU Heterogeneous Mode           : N/A
    GPU Reset Status
        Reset Required                    : Requested functionality has been deprecated
        Drain and Reset Recommended       : Requested functionality has been deprecated
    GPU Recovery Action                   : None
    GSP Firmware Version                  : 575.64.03
    IBMNPU
        Relaxed Ordering Mode             : N/A
    PCI
        Bus                               : Unknown Error
        Device                            : Unknown Error
        Domain                            : Unknown Error
        Base Classcode                    : Unknown Error
        Sub Classcode                     : Unknown Error
        Device Id                         : Unknown Error
        Bus Id                            : Unknown Error
        Sub System Id                     : Unknown Error
        GPU Link Info
            PCIe Generation
                Max                       : 5
                Current                   : 5
                Device Current            : 5
                Device Max                : 5
                Host Max                  : 5
            Link Width
                Max                       : 16x
                Current                   : 8x
        Bridge Chip
            Type                          : N/A
            Firmware                      : N/A
        Replays Since Reset               : 0
        Replay Number Rollovers           : 0
        Tx Throughput                     : 2318 KB/s
        Rx Throughput                     : 3641 KB/s
        Atomic Caps Outbound              : N/A
        Atomic Caps Inbound               : N/A
    Fan Speed                             : 0 %
    Performance State                     : P0
    Clocks Event Reasons
        Idle                              : Not Active
        Applications Clocks Setting       : Not Active
        SW Power Cap                      : Not Active
        HW Slowdown                       : Not Active
            HW Thermal Slowdown           : Not Active
            HW Power Brake Slowdown       : Not Active
        Sync Boost                        : Not Active
        SW Thermal Slowdown               : Not Active
        Display Clock Setting             : Not Active
    Clocks Event Reasons Counters
        SW Power Capping                  : 613738145 us
        Sync Boost                        : 0 us
        SW Thermal Slowdown               : 0 us
        HW Thermal Slowdown               : 0 us
        HW Power Braking                  : 0 us
    Sparse Operation Mode                 : N/A
    FB Memory Usage
        Total                             : 12227 MiB
        Reserved                          : 464 MiB
        Used                              : 1731 MiB
        Free                              : 10033 MiB
    BAR1 Memory Usage
        Total                             : 16384 MiB
        Used                              : 15 MiB
        Free                              : 16369 MiB
    Conf Compute Protected Memory Usage
        Total                             : 0 MiB
        Used                              : 0 MiB
        Free                              : 0 MiB
    Compute Mode                          : Default
    Utilization
        GPU                               : 9 %
        Memory                            : 1 %
        Encoder                           : 0 %
        Decoder                           : 0 %
        JPEG                              : 0 %
        OFA                               : 0 %
    Encoder Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    FBC Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    DRAM Encryption Mode
        Current                           : Disabled
        Pending                           : Disabled
    ECC Mode
        Current                           : N/A
        Pending                           : N/A
    ECC Errors
        Volatile
            SRAM Correctable              : N/A
            SRAM Uncorrectable Parity     : N/A
            SRAM Uncorrectable SEC-DED    : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
        Aggregate
            SRAM Correctable              : N/A
            SRAM Uncorrectable Parity     : N/A
            SRAM Uncorrectable SEC-DED    : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
            SRAM Threshold Exceeded       : N/A
        Aggregate Uncorrectable SRAM Sources
            SRAM L2                       : N/A
            SRAM SM                       : N/A
            SRAM Microcontroller          : N/A
            SRAM PCIE                     : N/A
            SRAM Other                    : N/A
    Retired Pages
        Single Bit ECC                    : N/A
        Double Bit ECC                    : N/A
        Pending Page Blacklist            : N/A
    Remapped Rows
        Correctable Error                 : 0
        Uncorrectable Error               : 0
        Pending                           : No
        Remapping Failure Occurred        : No
        Bank Remap Availability Histogram
            Max                           : 192 bank(s)
            High                          : 0 bank(s)
            Partial                       : 0 bank(s)
            Low                           : 0 bank(s)
            None                          : 0 bank(s)
    Temperature
        GPU Current Temp                  : 39 C
        GPU T.Limit Temp                  : 46 C
        GPU Shutdown T.Limit Temp         : -5 C
        GPU Slowdown T.Limit Temp         : -2 C
        GPU Max Operating T.Limit Temp    : 0 C
        GPU Target Temperature            : N/A
        Memory Current Temp               : N/A
        Memory Max Operating T.Limit Temp : N/A
    GPU Power Readings
        Average Power Draw                : 30.16 W
        Instantaneous Power Draw          : 31.19 W
        Current Power Limit               : 250.00 W
        Requested Power Limit             : 250.00 W
        Default Power Limit               : 250.00 W
        Min Power Limit                   : 175.00 W
        Max Power Limit                   : 300.00 W
    GPU Memory Power Readings
        Average Power Draw                : N/A
        Instantaneous Power Draw          : N/A
    Module Power Readings
        Average Power Draw                : N/A
        Instantaneous Power Draw          : N/A
        Current Power Limit               : N/A
        Requested Power Limit             : N/A
        Default Power Limit               : N/A
        Min Power Limit                   : N/A
        Max Power Limit                   : N/A
    Power Smoothing                       : N/A
    Workload Power Profiles
        Requested Profiles                : N/A
        Enforced Profiles                 : N/A
    Clocks
        Graphics                          : 1260 MHz
        SM                                : 1260 MHz
        Memory                            : 14001 MHz
        Video                             : 1672 MHz
    Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Default Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Deferred Clocks
        Memory                            : N/A
    Max Clocks
        Graphics                          : 3105 MHz
        SM                                : 3105 MHz
        Memory                            : 14001 MHz
        Video                             : 3090 MHz
    Max Customer Boost Clocks
        Graphics                          : N/A
    Clock Policy
        Auto Boost                        : N/A
        Auto Boost Default                : N/A
    Voltage
        Graphics                          : Requested functionality has been deprecated
    Fabric
        State                             : N/A
        Status                            : N/A
        CliqueId                          : N/A
        ClusterUUID                       : N/A
        Health
            Bandwidth                     : N/A
            Route Recovery in progress    : N/A
            Route Unhealthy               : N/A
            Access Timeout Recovery       : N/A
    Processes
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 16578
            Type                          : G
            Name                          : /usr/local/libexec/Xorg
            Used GPU Memory               : 498 MiB
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 18826
            Type                          : G
            Name                          : /usr/local/bin/enlightenment
            Used GPU Memory               : 415 MiB
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 25940
            Type                          : G
            Name                          : /usr/local/bin/firefox
            Used GPU Memory               : 343 MiB
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 43452
            Type                          : G
            Name                          : /usr/local/lib/virtualbox/VirtualBoxVM
            Used GPU Memory               : 379 MiB
    Capabilities
        EGM      
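
One way to see where the wrapped run stopped is to compare the field names printed by `nv-sglrun nvidia-smi -q` with those printed by the native `nvidia-smi -q`. The sketch below does that on short excerpts copied from the two logs above; the helper names are hypothetical, and it assumes fields are indented lines of the form `name : value`.

```python
def field_names(smi_q_output):
    """Return the indented field names (text before ':') in order of appearance."""
    names = []
    for line in smi_q_output.splitlines():
        name, sep, _ = line.partition(":")
        # Only indented "name : value" lines count; headers and traces are skipped.
        if sep and line.startswith(" "):
            names.append(name.strip())
    return names


def first_missing_field(native_output, shim_output):
    """Return the first native field that the wrapped run never printed, or None."""
    seen = set(field_names(shim_output))
    for name in field_names(native_output):
        if name not in seen:
            return name
    return None


# Excerpts copied from the two runs in this report.
native = """    Display Active                        : Enabled
    Persistence Mode                      : Disabled
    Addressing Mode                       : N/A
    MIG Mode
        Current                           : N/A
"""
shim = """    Display Active                        : Enabled
    Persistence Mode                      : Disabled
shim_ioctl_impl(-1, 0x27, _) is not implemented
"""

print(first_missing_field(native, shim))  # Addressing Mode
```

Applied to the two runs here, the first missing field is `Addressing Mode`, so the query behind that field is a plausible place to look for the unimplemented call. The output alone does not prove it.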

cederom · Jul 07 '25 02:07