`nvidia-smi` fails with `No devices were found` on RTX 5090 / GB202
NVIDIA Open GPU Kernel Modules Version
570.153.02
Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.
- [x] I confirm that this does not happen with the proprietary driver package.
Operating System and Version
Arch Linux
Kernel Release
6.14.7-arch2-1 #1 SMP PREEMPT_DYNAMIC Thu, 22 May 2025 05:37:49 +0000 x86_64 GNU/Linux
Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.
- [x] I am running on a stable kernel release.
Hardware: GPU
01:00.0 VGA compatible controller: NVIDIA Corporation GB202 [GeForce RTX 5090] (rev a1) (as nvidia-smi fails I copied the output of lspci)
Describe the bug
When I run `nvidia-smi` without any arguments, it fails with a `No devices were found` error after a delay of roughly 5 seconds.
To Reproduce
- Run `nvidia-smi` in a shell
Bug Incidence
Always
nvidia-bug-report.log.gz
More Info
I'm pretty sure the information below is included in nvidia-bug-report.log.gz, but here's what I've gathered so far:
dmesg:
$ dmesg -H | grep -iE 'nvrm|gsp|xid'
[ +8.679149] NVRM: GPU at PCI:0000:01:00: GPU-343788c6-573a-6251-02ae-4287220a092b
[ +0.000003] NVRM: Xid (PCI:0000:01:00): 143, Error status 0x65 while polling for FSP boot complete, 0x13, 0x56, 0x0, 0x0, 0x2
[ +0.000005] NVRM: nvCheckOkFailedNoLog: Check failed: Call timed out [NV_ERR_TIMEOUT] (0x00000065) returned from kgspWaitForGfwBootOk_HAL(pGpu, pKernelGsp) @ kernel_gsp.c:3676
[ +0.000028] NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
[ +0.001812] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x62:0x65:1859)
[ +0.001485] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
lsmod:
$ lsmod | grep -iE 'nvidia|nouv'
nvidia_drm 139264 0
nvidia_modeset 2158592 1 nvidia_drm
drm_ttm_helper 16384 1 nvidia_drm
nvidia_uvm 3940352 0
nvidia 13119488 2 nvidia_uvm,nvidia_modeset
video 81920 1 nvidia_modeset
modinfo:
$ modinfo nvidia
filename: /lib/modules/6.14.7-arch2-1/extramodules/nvidia.ko.zst
import_ns: DMA_BUF
alias: char-major-195-*
version: 570.153.02
supported: external
license: Dual MIT/GPL
firmware: nvidia/570.153.02/gsp_tu10x.bin
firmware: nvidia/570.153.02/gsp_ga10x.bin
srcversion: 9C27A8B290453A7640E09FB
alias: pci:v000010DEd*sv*sd*bc06sc80i00*
alias: pci:v000010DEd*sv*sd*bc03sc02i00*
alias: pci:v000010DEd*sv*sd*bc03sc00i00*
depends:
name: nvidia
retpoline: Y
vermagic: 6.14.7-arch2-1 SMP preempt mod_unload
parm: NvSwitchRegDwords:NvSwitch regkey (charp)
parm: NvSwitchBlacklist:NvSwitchBlacklist=uuid[,uuid...] (charp)
parm: NVreg_ResmanDebugLevel:int
parm: NVreg_RmLogonRC:int
parm: NVreg_ModifyDeviceFiles:int
parm: NVreg_DeviceFileUID:int
parm: NVreg_DeviceFileGID:int
parm: NVreg_DeviceFileMode:int
parm: NVreg_InitializeSystemMemoryAllocations:int
parm: NVreg_UsePageAttributeTable:int
parm: NVreg_EnablePCIeGen3:int
parm: NVreg_EnableMSI:int
parm: NVreg_EnableStreamMemOPs:int
parm: NVreg_RestrictProfilingToAdminUsers:int
parm: NVreg_PreserveVideoMemoryAllocations:int
parm: NVreg_EnableS0ixPowerManagement:int
parm: NVreg_S0ixPowerManagementVideoMemoryThreshold:int
parm: NVreg_DynamicPowerManagement:int
parm: NVreg_DynamicPowerManagementVideoMemoryThreshold:int
parm: NVreg_EnableGpuFirmware:int
parm: NVreg_EnableGpuFirmwareLogs:int
parm: NVreg_OpenRmEnableUnsupportedGpus:int
parm: NVreg_EnableUserNUMAManagement:int
parm: NVreg_MemoryPoolSize:int
parm: NVreg_KMallocHeapMaxSize:int
parm: NVreg_VMallocHeapMaxSize:int
parm: NVreg_IgnoreMMIOCheck:int
parm: NVreg_NvLinkDisable:int
parm: NVreg_EnablePCIERelaxedOrderingMode:int
parm: NVreg_RegisterPCIDriver:int
parm: NVreg_EnableResizableBar:int
parm: NVreg_EnableDbgBreakpoint:int
parm: NVreg_EnableNonblockingOpen:int
parm: NVreg_RegistryDwords:charp
parm: NVreg_RegistryDwordsPerDevice:charp
parm: NVreg_RmMsg:charp
parm: NVreg_GpuBlacklist:charp
parm: NVreg_TemporaryFilePath:charp
parm: NVreg_ExcludedGpus:charp
parm: NVreg_DmaRemapPeerMmio:int
parm: NVreg_RmNvlinkBandwidth:charp
parm: NVreg_RmNvlinkBandwidthLinkCount:int
parm: NVreg_ImexChannelCount:int
parm: NVreg_CreateImexChannel0:int
parm: NVreg_GrdmaPciTopoCheckOverride:int
parm: rm_firmware_active:charp
Not sure if it's related, but I installed the latest version of linux-firmware (specifically commit 3fbaee27) from kernel.org so I've got a few files under /usr/lib/firmware/nvidia/gb202/gsp/:
$ ls -lh /usr/lib/firmware/nvidia/gb202/gsp/
total 392K
-rw-r--r-- 1 root root 195K May 24 03:14 bootloader-570.144.bin.zst
-rw-r--r-- 1 root root 196K May 24 03:14 fmc-570.144.bin.zst
lrwxrwxrwx 1 root root 35 May 24 03:14 gsp-570.144.bin.zst -> ../../ga102/gsp/gsp-570.144.bin.zst
I also tested several other combinations of driver and kernel versions, but none of them worked as expected (although some of them produced a different error).
| kernel (rows) / driver (columns) | 570.133.07 | 570.144 | 570.153.02 | 575.51.02 |
|---|---|---|---|---|
| 6.14.4 | NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running. | not tested | not tested | not tested |
| 6.14.6 | not tested | No devices were found | not tested | No devices were found |
| 6.14.7 | not tested | NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running. | No devices were found (the reported setup) | No devices were found |
@ikr7 what is the device id of your gpu? Please post the output of lspci -vs 01:00.0
The only RTX 5090 device IDs supported by the open source driver so far are 2B85 and 2B87 for desktop cards, and 2C18 and 2C58 for notebooks:
src/nvidia/generated/g_nv_name_released.h:5417: { 0x2B85, 0x0000, 0x0000, "NVIDIA GeForce RTX 5090" },
src/nvidia/generated/g_nv_name_released.h:5418: { 0x2B87, 0x0000, 0x0000, "NVIDIA GeForce RTX 5090 D" },
src/nvidia/generated/g_nv_name_released.h:5429: { 0x2C18, 0x0000, 0x0000, "NVIDIA GeForce RTX 5090 Laptop GPU" },
src/nvidia/generated/g_nv_name_released.h:5431: { 0x2C58, 0x0000, 0x0000, "NVIDIA GeForce RTX 5090 Laptop GPU" },
That does not show the PCI ID; use lspci -vnns 01:00.0 instead.
@elsaco @foxwhite25
Here's the output of lspci -vnns 01:00.0 (run as root); it seems the card should be supported by the driver.
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GB202 [GeForce RTX 5090] [10de:2b85] (rev a1) (prog-if 00 [VGA controller])
Subsystem: ZOTAC International (MCO) Ltd. Device [19da:1761]
Flags: bus master, fast devsel, latency 0, IRQ 68
Memory at f8000000 (32-bit, non-prefetchable) [size=64M]
Memory at d0000000 (64-bit, prefetchable) [size=256M]
Memory at e0000000 (64-bit, prefetchable) [size=32M]
I/O ports at f000 [size=128]
Expansion ROM at 000c0000 [disabled] [size=128K]
Capabilities: [40] Power Management version 3
Capabilities: [48] MSI: Enable- Count=1/16 Maskable+ 64bit+
Capabilities: [60] Express Legacy Endpoint, IntMsgNum 0
Capabilities: [9c] Vendor Specific Information: Len=14 <?>
Capabilities: [b0] MSI-X: Enable- Count=9 Masked-
Capabilities: [100] Secondary PCI Express
Capabilities: [12c] Latency Tolerance Reporting
Capabilities: [134] Physical Resizable BAR
Capabilities: [140] Virtual Resizable BAR
Capabilities: [14c] Data Link Feature <?>
Capabilities: [158] Physical Layer 16.0 GT/s <?>
Capabilities: [188] Physical Layer 32.0 GT/s <?>
Capabilities: [1b8] Advanced Error Reporting
Capabilities: [200] Lane Margining at the Receiver
Capabilities: [248] Alternative Routing-ID Interpretation (ARI)
Capabilities: [250] Single Root I/O Virtualization (SR-IOV)
Capabilities: [290] L1 PM Substates
Capabilities: [2a4] Vendor Specific Information: ID=0001 Rev=1 Len=014 <?>
Capabilities: [2bc] Power Budgeting <?>
Capabilities: [2f4] Device Serial Number d2-ba-fa-82-95-2d-b0-48
Kernel driver in use: nvidia
Kernel modules: nouveau, nvidia_drm, nvidia
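The comparison made above (device ID from `lspci -nn` against the RTX 5090 IDs quoted from g_nv_name_released.h) can be sketched as a small shell helper. This is only an illustration: the function names `nvidia_device_id` and `is_listed_5090_id` are hypothetical, and the ID list contains just the four entries quoted in this thread, not the driver's full table.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, and the list below holds just
# the four RTX 5090 IDs quoted in this thread from g_nv_name_released.h.
SUPPORTED_5090_IDS="2b85 2b87 2c18 2c58"

# Extract the NVIDIA device ID from one line of `lspci -nn` output,
# e.g. "... [GeForce RTX 5090] [10de:2b85] (rev a1)" -> "2b85".
# Prints nothing if the line has no [10de:xxxx] field.
nvidia_device_id() {
    printf '%s\n' "$1" \
        | sed -nE 's/.*\[10de:([0-9a-fA-F]{4})\].*/\1/p' \
        | tr 'A-F' 'a-f'
}

# Return 0 if the given device ID (any letter case) is in the list above,
# 1 otherwise.
is_listed_5090_id() {
    _id=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    case " $SUPPORTED_5090_IDS " in
        *" $_id "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A possible invocation would be `is_listed_5090_id "$(nvidia_device_id "$(lspci -nns 01:00.0)")" && echo listed`. A match only means the ID is in the quoted list; it says nothing about whether the card initializes.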
Have you activated both Resizable BAR and Above 4G Decoding in the BIOS?
While those settings were deactivated (default setting in my BIOS) I saw the same results as you:
# nvidia-smi
No devices were found
# meanwhile, output in /var/log/syslog
Jul 01 22:55:07 gpu kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
Jul 01 22:55:07 gpu kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x22:0x56:884)
Jul 01 22:55:07 gpu kernel: NVRM: The NVIDIA GPU 0000:01:00.0 (PCI ID: 10de:2b85)
NVRM: installed in this system requires use of the NVIDIA open kernel modules.
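Whether the BIOS settings took effect can be cross-checked from Linux by looking at the BAR sizes that `lspci -v` reports. The sketch below prints the largest 64-bit prefetchable memory region in MiB. It rests on an assumption that is not stated anywhere in this thread: that a largest region of only 256M on a 32 GB card is a hint that Resizable BAR is not in effect. Treat it as a rule of thumb. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical. Interpreting a small BAR as
# "Resizable BAR not in effect" is a rule of thumb, not a documented check.

# Read `lspci -v` output on stdin and print the size token of every
# 64-bit prefetchable memory region, one per line (e.g. "256M").
prefetchable_bar_sizes() {
    sed -nE 's/.*\(64-bit, prefetchable\).*\[size=([0-9]+[KMGT])\].*/\1/p'
}

# Convert an lspci size token such as "256M" or "32G" to MiB.
size_to_mib() {
    _num=${1%[KMGT]}
    case $1 in
        *K) echo $((_num / 1024)) ;;
        *M) echo "$_num" ;;
        *G) echo $((_num * 1024)) ;;
        *T) echo $((_num * 1024 * 1024)) ;;
        *) return 1 ;;
    esac
}

# Read `lspci -v` output on stdin and print the largest 64-bit prefetchable
# region in MiB (0 if none was found).
largest_prefetchable_bar_mib() {
    _max=0
    for _size in $(prefetchable_bar_sizes); do
        _mib=$(size_to_mib "$_size") || continue
        if [ "$_mib" -gt "$_max" ]; then
            _max=$_mib
        fi
    done
    echo "$_max"
}
```

It could be run as `sudo lspci -vs 01:00.0 | largest_prefetchable_bar_mib`. Fed the three `Memory at ...` lines from the lspci output posted earlier in this thread, it prints 256.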
After activating Resizable BAR and Above 4G Decoding, I installed the driver as follows:
# from https://www.nvidia.com/de-de/drivers/details/250991/
wget "https://us.download.nvidia.com/XFree86/Linux-x86_64/575.64.05/NVIDIA-Linux-x86_64-575.64.05.run" -O "NVIDIA-Linux-x86_64-575.64.05.run"
chmod +x NVIDIA-Linux-x86_64-575.64.05.run
./NVIDIA-Linux-x86_64-575.64.05.run
# choose "MIT/GPL" for the open driver!
Furthermore, I opened the GRUB defaults with nano /etc/default/grub and changed this line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet pci=realloc" (before it was just GRUB_CMDLINE_LINUX_DEFAULT="quiet")
Then I ran update-grub and rebooted. I'm no longer sure whether the GRUB change was needed, though.
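To confirm after the reboot that the GRUB change actually reached the running kernel, the kernel command line in /proc/cmdline can be inspected. A minimal sketch follows; the helper name `cmdline_has_param` is hypothetical, and it only matches the exact token, so a combined form such as `pci=realloc,nocrs` would not be detected.

```shell
#!/bin/sh
# Sketch only: cmdline_has_param is a hypothetical helper name.
# Return 0 if the kernel command line in $1 contains the exact
# space-separated parameter $2 (e.g. "pci=realloc"), 1 otherwise.
cmdline_has_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example: `cmdline_has_param "$(cat /proc/cmdline)" pci=realloc && echo "pci=realloc is active"`.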
Anyway this has worked for me.
nvidia-smi
Sat Jul 26 15:18:39 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.64.05 Driver Version: 575.64.05 CUDA Version: 12.9 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 5090 On | 00000000:0A:00.0 Off | N/A |
| 30% 42C P8 21W / 400W | 0MiB / 32607MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
Edit: updated driver recommendation
Thanks for your reply. However, my card also stopped outputting video, so I sent it back to the manufacturer for repair. I'll update you with more information once it's returned.
Thanks a lot!!!!!!!!! Your solution works for my case.
@cyril23 thank you for https://github.com/NVIDIA/open-gpu-kernel-modules/issues/862#issuecomment-3121487185. Installing the driver from https://www.nvidia.com/en-us/drivers/details/253003/ on Debian 12.12 and choosing "MIT/GPL" for the open driver fixed nvidia-smi for the NVIDIA GeForce RTX 5050 that comes with the Lenovo YOGA AURA edition intel9.
I got my card back from the manufacturer after repair and it now works flawlessly, meaning this was a hardware malfunction. Closing the issue. Thanks, y'all, for the helpful comments.