Intermittent PCI errors after 4.1 upgrade: sys-usb fails to restart due to "Unable to reset PCI device"

Open tetrahedras opened this issue 3 years ago • 5 comments

Release 4.1

My sys-usb should start on boot. Often it does not. When I manually try to start it, it sometimes (but not always) produces the error:

$ qvm-start sys-usb
Start failed: internal error: Unable to reset PCI device 0000:00:14.0: internal error: libxenlight failed to create new domain 'disp-sys-usb', see /var/log/libvirt/libxl/libxl-driver.log for details

After seeing this error, if I simply repeat the command qvm-start sys-usb then it starts fine.
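
Since a second attempt usually succeeds, a retry wrapper is one possible stopgap until the root cause is found. This is only a minimal sketch: it assumes `qvm-start` exits non-zero when the start fails, and the `attempts`, `delay`, `run`, and `sleep` parameters are made up for illustration (the last two exist only so the function can be exercised outside dom0).

```python
import subprocess
import time


def start_with_retry(name, attempts=3, delay=2.0,
                     run=subprocess.run, sleep=time.sleep):
    """Run `qvm-start NAME`, retrying while it exits non-zero.

    Returns the 1-based number of the attempt that succeeded,
    or 0 if every attempt failed. All parameters other than
    `name` are hypothetical knobs, not part of any Qubes tool.
    """
    for attempt in range(1, attempts + 1):
        result = run(["qvm-start", name])
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Pause briefly before trying again.
            sleep(delay)
    return 0
```

In dom0 this would be called as `start_with_retry("sys-usb")`. It hides the symptom rather than fixing it, so the log is still worth checking.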

libxl-driver.log contains the following relevant lines:

2022-05-04 15:41:49.913+0000: libxl: libxl_create.c:1904:domcreate_attach_devices: Domain 3:unable to add pci devices
2022-05-04 15:41:50.343+0000: libxl: libxl_pci.c:1489:libxl__device_pci_reset: The kernel doesn't support reset from sysfs for PCI device 0000:00:14.0
2022-05-04 15:45:30.561+0000: libxl: libxl_pci.c:1489:libxl__device_pci_reset: The kernel doesn't support reset from sysfs for PCI device 0000:00:14.0
2022-05-04 15:45:42.699+0000: libxl: libxl_pci.c:1774:device_pci_add_done: Domain 10:libxl__device_pci_add  failed for PCI device 0:0:14.0 (rc -9)
2022-05-04 15:45:42.710+0000: libxl: libxl_pci.c:1774:device_pci_add_done: Domain 10:libxl__device_pci_add  failed for PCI device 0:0:1a.0 (rc -9)
2022-05-04 15:45:42.710+0000: libxl: libxl_create.c:1904:domcreate_attach_devices: Domain 10:unable to add pci devices
2022-05-04 15:45:43.136+0000: libxl: libxl_pci.c:1489:libxl__device_pci_reset: The kernel doesn't support reset from sysfs for PCI device 0000:00:14.0
2022-05-04 15:46:02.753+0000: libxl: libxl_pci.c:1489:libxl__device_pci_reset: The kernel doesn't support reset from sysfs for PCI device 0000:00:14.0
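
The log refers to the same device both as `0000:00:14.0` and as `0:0:14.0`, which makes it easy to miss that two devices (`14.0` and `1a.0`) are involved. A rough sketch for tallying these messages per device follows; the function names and the `reset-unsupported` / `add-failed` labels are my own, and the regular expression is based only on the lines quoted above.

```python
import re
from collections import Counter

# Matches both "0000:00:14.0" and the shortened "0:0:14.0" form.
BDF_RE = re.compile(
    r"PCI device ([0-9a-fA-F]+):([0-9a-fA-F]+):([0-9a-fA-F]+)\.([0-7])"
)


def normalize_bdf(domain, bus, dev, func):
    """Return the address in zero-padded domain:bus:device.function form."""
    return "%04x:%02x:%02x.%x" % (
        int(domain, 16), int(bus, 16), int(dev, 16), int(func, 16)
    )


def summarize(lines):
    """Count libxl PCI messages per (device, kind of message)."""
    counts = Counter()
    for line in lines:
        match = BDF_RE.search(line)
        if not match:
            continue  # e.g. the generic "unable to add pci devices" line
        if "libxl__device_pci_reset" in line:
            kind = "reset-unsupported"
        elif "device_pci_add_done" in line:
            kind = "add-failed"
        else:
            kind = "other"
        counts[(normalize_bdf(*match.groups()), kind)] += 1
    return counts
```

Feeding it the lines of `/var/log/libvirt/libxl/libxl-driver.log` gives a per-device count of reset warnings versus actual attach failures.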

I'm not sure which device this is. There is no 0000:00:14.0 entry in qvm-pci (dom0) or lspci (sys-usb).
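
One way to find out what `0000:00:14.0` is, and whether the kernel offers a reset for it, is to look at the device's sysfs directory in dom0. The sketch below assumes the standard Linux layout under `/sys/bus/pci/devices`, where the `class`, `vendor`, and `device` files hold hex IDs and a `reset` file is present only when the kernel can reset the device. The function name and the `sysfs_root` parameter are hypothetical, added so the code can be pointed at a test directory.

```python
from pathlib import Path


def describe_pci_device(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return basic facts about a PCI device, or None if it is not listed."""
    dev = Path(sysfs_root) / bdf
    if not dev.is_dir():
        return None

    def read(name):
        path = dev / name
        return path.read_text().strip() if path.is_file() else None

    pci_class = read("class")
    return {
        "vendor": read("vendor"),
        "device": read("device"),
        "class": pci_class,
        # PCI class 0x0c (serial bus), subclass 0x03 is a USB controller.
        "is_usb_controller": bool(pci_class)
        and pci_class.lower().startswith("0x0c03"),
        # libxl's message suggests this file is what it looks for.
        "has_sysfs_reset": (dev / "reset").exists(),
    }
```

Running `describe_pci_device("0000:00:14.0")` in dom0 should show whether this is a USB controller and whether the `reset` file is missing, which would match the log message.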

Since this looks like a kernel support issue, I will try rolling back the kernel for sys-usb from 5.10 (current) to 5.4 and see if the issue persists.

tetrahedras, May 04 '22 16:05

Confirmed: rolling back the sys-usb kernel to 5.4 has fixed this so far. Since the issue is intermittent, the kernel may have nothing to do with it, and the issue may recur later.

tetrahedras, May 04 '22 16:05

> 2022-05-04 15:45:42.710+0000: libxl: libxl_create.c:1904:domcreate_attach_devices: Domain 10:unable to add pci devices

This is happening before the VM kernel is even started, so its version is unlikely to have an impact here. Do you have all updates installed? Especially, do you have xen-libs-4.14.4 or newer in dom0?
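
For anyone scripting the version check, here is a small sketch that compares a package string against 4.14.4. It assumes the string comes from something like `rpm -q xen-libs` in dom0 and looks only at the first `X.Y.Z` it finds, so it is not a full RPM version comparison. The function name is made up for illustration.

```python
import re


def xen_version_at_least(package_string, minimum=(4, 14, 4)):
    """Return True if the first X.Y.Z in the string is >= `minimum`."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", package_string)
    if match is None:
        raise ValueError("no X.Y.Z version found in %r" % package_string)
    found = tuple(int(part) for part in match.groups())
    # Tuples compare element by element, so (4, 14, 5) >= (4, 14, 4).
    return found >= tuple(minimum)
```

The release and distribution suffix (the part after the dash) is ignored on purpose, since the question is only about the upstream Xen version.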

marmarek, May 04 '22 17:05

> Do you have all updates installed? Especially, do you have xen-libs-4.14.4 or newer in dom0?

Maybe not, due to #7503

tetrahedras, May 09 '22 17:05

I can confirm that this is happening on a recent Qubes 4.1.1 installation.

@marmarek:

> Do you have all updates installed? Especially, do you have xen-libs-4.14.4 or newer in dom0?

Yes, and I have xen-libs-4.14.5-6.fc32.

andrewdavidwong, Jul 31 '22 02:07

Possibly related: #6824

andrewdavidwong, Jul 31 '22 03:07