NVRM: GPU 0000:13:00.0: RmInitAdapter failed! (0x26:0x56:1463)

Open chaiyd opened this issue 2 years ago • 6 comments

I installed nvidia-driver version 510.47.03 on Debian 11, and it does not work.

curl -fsSL https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/7fa2af80.pub | apt-key add -
echo "deb https://mirrors.aliyun.com/nvidia-cuda/debian11/x86_64/ /" > /etc/apt/sources.list.d/cuda.list

apt update && apt install nvidia-driver 

Error info:

Feb 09 15:47:05 debian kernel: ACPI Warning: \_SB.PCI0.PE60.S1F0._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20200925/nsarguments-61)
Feb 09 15:47:06 debian kernel: NVRM: GPU 0000:13:00.0: RmInitAdapter failed! (0x26:0x56:1463)
Feb 09 15:47:06 debian kernel: BUG: unable to handle page fault for address: 0000000000002a04
Feb 09 15:47:06 debian kernel: #PF: supervisor read access in kernel mode
Feb 09 15:47:06 debian kernel: #PF: error_code(0x0000) - not-present page
Feb 09 15:47:06 debian kernel: PGD 0 P4D 0
Feb 09 15:47:06 debian kernel: Oops: 0000 [#1] SMP NOPTI
Feb 09 15:47:06 debian kernel: NVRM: GPU 0000:13:00.0: rm_init_adapter failed, device minor number 0
Feb 09 15:47:06 debian kernel: CPU: 15 PID: 591 Comm: nv_queue Tainted: P           OE     5.10.0-11-amd64 #1 Debian 5.10.92-1
Feb 09 15:47:06 debian kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Feb 09 15:47:06 debian kernel: RIP: 0010:_nv009917rm+0x38/0xc0 [nvidia]
Feb 09 15:47:06 debian kernel: Code: 49 f1 01 48 8b bb 68 01 00 00 e8 d3 55 4d 00 85 c0 74 0f 48 83 c4 08 5b 41 5c c3 0f 1f 80 00 00 00 00 44 89 e7 e8 38 0b be ff <8b> 90 04 2a 00 00 83 fa 01 74 2f 80 b8 0c 05 00 00 00 74 12 80 b8
Feb 09 15:47:06 debian kernel: RSP: 0018:ffff9b5200adbde0 EFLAGS: 00010246
Feb 09 15:47:06 debian kernel: RAX: 0000000000000000 RBX: ffff8ad080fbf408 RCX: 0000000000000000
Feb 09 15:47:06 debian kernel: RDX: ffff9b5200adbe0c RSI: 0000000000000000 RDI: 0000000000000000
Feb 09 15:47:06 debian kernel: RBP: ffff8ad1035e6000 R08: 0000000000003000 R09: 0000000000000000
Feb 09 15:47:06 debian kernel: R10: ffff9b5200adbe10 R11: c00c03adbfbfe201 R12: 0000000000000000
Feb 09 15:47:06 debian kernel: R13: ffff8ad1035e3000 R14: ffff8ad1028b4708 R15: 0000000000000246
Feb 09 15:47:06 debian kernel: FS:  0000000000000000(0000) GS:ffff8ad79e1c0000(0000) knlGS:0000000000000000
Feb 09 15:47:06 debian kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 09 15:47:06 debian kernel: CR2: 0000000000002a04 CR3: 000000010230a004 CR4: 00000000007706e0
Feb 09 15:47:06 debian kernel: PKRU: 55555554
Feb 09 15:47:06 debian kernel: Call Trace:
Feb 09 15:47:06 debian kernel:  ? rm_execute_work_item+0x108/0x120 [nvidia]
Feb 09 15:47:06 debian kernel:  ? newidle_balance+0x1d3/0x3c0
Feb 09 15:47:06 debian kernel:  ? os_execute_work_item+0x46/0x60 [nvidia]
Feb 09 15:47:06 debian kernel:  ? _main_loop+0x9e/0x150 [nvidia]
Feb 09 15:47:06 debian kernel:  ? nvidia_modeset_resume+0x20/0x20 [nvidia]
Feb 09 15:47:06 debian kernel:  ? kthread+0x11b/0x140
Feb 09 15:47:06 debian kernel:  ? __kthread_bind_mask+0x60/0x60
Feb 09 15:47:06 debian kernel:  ? ret_from_fork+0x1f/0x30
Feb 09 15:47:06 debian kernel: Modules linked in: skx_edac(-) nfit libnvdimm ghash_clmulni_intel aesni_intel libaes crypto_simd cryptd glue_helper vsock_loopback vmw_vsock_virtio_transport_common rapl vmw_vsock_vmci_transport vsock vmwgfx nvidia_drm(POE) ttm drm_kms_helper vmw_balloon nvidia_modeset(POE) joydev sg serio_raw vmw_vmci cec pcspkr evdev button ac nvidia(POE) drm fuse configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic sd_mod t10_pi crc_t10dif crct10dif_generic ata_generic ata_piix libata crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel vmw_pvscsi psmouse scsi_mod vmxnet3 i2c_piix4
Feb 09 15:47:06 debian kernel: CR2: 0000000000002a04
Feb 09 15:47:06 debian kernel: ---[ end trace af86ac5b7d3fea4b ]---
Feb 09 15:47:06 debian kernel: RIP: 0010:_nv009917rm+0x38/0xc0 [nvidia]
Feb 09 15:47:06 debian kernel: Code: 49 f1 01 48 8b bb 68 01 00 00 e8 d3 55 4d 00 85 c0 74 0f 48 83 c4 08 5b 41 5c c3 0f 1f 80 00 00 00 00 44 89 e7 e8 38 0b be ff <8b> 90 04 2a 00 00 83 fa 01 74 2f 80 b8 0c 05 00 00 00 74 12 80 b8
Feb 09 15:47:06 debian kernel: RSP: 0018:ffff9b5200adbde0 EFLAGS: 00010246
Feb 09 15:47:06 debian kernel: RAX: 0000000000000000 RBX: ffff8ad080fbf408 RCX: 0000000000000000
Feb 09 15:47:06 debian kernel: RDX: ffff9b5200adbe0c RSI: 0000000000000000 RDI: 0000000000000000
Feb 09 15:47:06 debian kernel: RBP: ffff8ad1035e6000 R08: 0000000000003000 R09: 0000000000000000
Feb 09 15:47:06 debian kernel: R10: ffff9b5200adbe10 R11: c00c03adbfbfe201 R12: 0000000000000000
Feb 09 15:47:06 debian kernel: R13: ffff8ad1035e3000 R14: ffff8ad1028b4708 R15: 0000000000000246
Feb 09 15:47:06 debian kernel: FS:  0000000000000000(0000) GS:ffff8ad79e1c0000(0000) knlGS:0000000000000000
Feb 09 15:47:06 debian kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 09 15:47:06 debian kernel: CR2: 0000000000002a04 CR3: 000000010230a004 CR4: 00000000007706e0
Feb 09 15:47:06 debian kernel: PKRU: 55555554
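
For anyone comparing logs across similar reports, the failing status triple can be pulled out of the kernel log mechanically. The following is a minimal sketch, assuming a Bash shell with `grep -E` and `sed -E` available. `nvrm_init_failures` is a hypothetical helper name, not an NVIDIA tool, and it only matches the `(0x..:0x..:N)` form that appears in the log above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the NVIDIA driver):
# read kernel log text on stdin and print each distinct
# RmInitAdapter status triple, e.g. 0x26:0x56:1463.
nvrm_init_failures() {
    # Keep only the "RmInitAdapter failed! (...)" fragment of each line,
    # then strip everything except the text inside the parentheses.
    grep -oE 'RmInitAdapter failed! \(0x[0-9a-fA-F]+:0x[0-9a-fA-F]+:[0-9]+\)' \
        | sed -E 's/.*\((.*)\)/\1/' \
        | sort -u
}

# Example usage on a live system (needs permission to read the kernel journal):
#   journalctl -k -b | nvrm_init_failures
```

Posting the extracted triple together with the driver version and kernel version makes it easier to tell whether two reports are likely to be the same failure.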

chaiyd avatar Feb 09 '22 08:02 chaiyd

I have been stuck on this problem for several days. I tried Ubuntu 22.04 and 20.04 with NVIDIA drivers 515, 510, 495, 470, 465, and 460. None of the combinations worked. Does anyone know why? I tested on both ESXi 7.0u3 and 7.0u2.

sxlllslgh avatar May 22 '22 13:05 sxlllslgh

Same here, with nvidia-dkms 510.73.08-0ubuntu1 on some Ubuntu 5.17 kernel and a GeForce RTX 2080.

kalvdans avatar Jul 22 '22 14:07 kalvdans

Same here.

I have been stuck on this problem for several days. I tried Ubuntu 22.04 and 20.04 with NVIDIA drivers 515, 510, 495, 470, 465, and 460. None of the combinations worked. Does anyone know why? I tested on both ESXi 7.0u3 and 7.0u2.

Can you fix this problem? I am running into it too.

1104662797 avatar Oct 08 '22 02:10 1104662797

@1104662797 Have you tried this on your VM? https://github.com/aperim/docker-nvidia-cuda-ffmpeg#getting-it-working

troykelly avatar Oct 08 '22 05:10 troykelly

I have the same issue on Fedora 37 on a Dell Precision Mobile 5540, but only with the 525 drivers, not with 520 or below:

https://forums.developer.nvidia.com/t/rminitadapter-failed-0x241423-with-525-78-01-but-not-520-56-06-on-fedora-37/239876

arcivanov avatar Feb 15 '23 03:02 arcivanov

I hit this on Ubuntu 22.04 and tried drivers 535 and 515 on a PowerEdge R730 with a Tesla P100. Any updates?

cemdede avatar Aug 07 '23 03:08 cemdede