Linux - Task sometimes blocks when removing virtio device or running lspci
Sometimes screen output with virtio gets stuck, and apps start complaining that they can't output via OpenGL.
I can usually get things going again by removing the virtio PCI device, rescanning, and then logging back into my desktop (an improvement over rebooting the VM).
lspci
[snipped]
00:03.0 Display controller: Red Hat, Inc. Virtio GPU (rev 01)
[snipped]
echo 1 > /sys/bus/pci/devices/0000\:00\:03.0/remove
sleep 1
echo "1" > /sys/bus/pci/rescan
killall gnome-session
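The PCI address in the remove command above is hard-coded from the lspci output. As a minimal sketch (not part of the original report), a small helper can derive the sysfs path from lspci-style text instead. The function name `virtio_gpu_sysfs_path` is hypothetical, and the helper assumes PCI domain 0000, matching the path used above.

```shell
#!/bin/sh
# Hypothetical helper: read lspci-style text on stdin and print the sysfs
# path of the first line that mentions "Virtio GPU".
# Assumption: the device lives in PCI domain 0000, as in the path above.
virtio_gpu_sysfs_path() {
    awk '/Virtio GPU/ { print "/sys/bus/pci/devices/0000:" $1; exit }'
}

# Example use (as root), left commented out so sourcing this file has no
# side effects:
#   dev=$(lspci | virtio_gpu_sysfs_path)
#   echo 1 > "$dev/remove"
#   sleep 1
#   echo 1 > /sys/bus/pci/rescan
```

If the helper prints nothing, no matching device was found and the remove step should be skipped.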
Sometimes either the command to remove the device or lspci ends up blocking.
The last time this blocked on the first echo, I checked journalctl and found these messages:
❯ journalctl -b -k -f
Jan 08 18:31:15 ubuntu-vm kernel: __schedule+0x33c/0x838
Jan 08 18:31:15 ubuntu-vm kernel: schedule+0x68/0x168
Jan 08 18:31:15 ubuntu-vm kernel: virtio_gpu_queue_ctrl_sgs+0x118/0x358 [virtio_gpu]
Jan 08 18:31:15 ubuntu-vm kernel: virtio_gpu_queue_fenced_ctrl_buffer+0x208/0x270 [virtio_gpu]
Jan 08 18:31:15 ubuntu-vm kernel: virtio_gpu_cmd_get_edids+0xf4/0x1a8 [virtio_gpu]
Jan 08 18:31:15 ubuntu-vm kernel: virtio_gpu_config_changed_work_func+0x130/0x158 [virtio_gpu]
Jan 08 18:31:15 ubuntu-vm kernel: process_one_work+0x168/0x3f0
Jan 08 18:31:15 ubuntu-vm kernel: worker_thread+0x360/0x480
Jan 08 18:31:15 ubuntu-vm kernel: kthread+0xf8/0x110
Jan 08 18:31:15 ubuntu-vm kernel: ret_from_fork+0x10/0x20
Jan 08 18:33:15 ubuntu-vm kernel: INFO: task Xorg:2569 blocked for more than 241 seconds.
Jan 08 18:33:15 ubuntu-vm kernel: Not tainted 6.6.4-060604-generic #202312030734
Jan 08 18:33:15 ubuntu-vm kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 08 18:33:15 ubuntu-vm kernel: task:Xorg state:D stack:0 pid:2569 ppid:2564 flags:0x00000005
Jan 08 18:33:15 ubuntu-vm kernel: Call trace:
Jan 08 18:33:15 ubuntu-vm kernel: __switch_to+0xc0/0x108
Jan 08 18:33:15 ubuntu-vm kernel: __schedule+0x33c/0x838
Jan 08 18:33:15 ubuntu-vm kernel: schedule+0x68/0x168
Jan 08 18:33:15 ubuntu-vm kernel: virtio_gpu_queue_ctrl_sgs+0x118/0x358 [virtio_gpu]
Jan 08 18:33:15 ubuntu-vm kernel: virtio_gpu_queue_fenced_ctrl_buffer+0x208/0x270 [virtio_gpu]
Jan 08 18:33:15 ubuntu-vm kernel: virtio_gpu_cmd_transfer_to_host_2d+0xd8/0x150 [virtio_gpu]
Jan 08 18:33:15 ubuntu-vm kernel: virtio_gpu_cursor_plane_update+0x1a0/0x330 [virtio_gpu]
Jan 08 18:33:15 ubuntu-vm kernel: drm_atomic_helper_commit_planes+0x108/0x358 [drm_kms_helper]
Jan 08 18:33:15 ubuntu-vm kernel: drm_atomic_helper_commit_tail+0x60/0xd0 [drm_kms_helper]
Jan 08 18:33:15 ubuntu-vm kernel: commit_tail+0x1dc/0x268 [drm_kms_helper]
Jan 08 18:33:15 ubuntu-vm kernel: drm_atomic_helper_commit+0x1b8/0x1d8 [drm_kms_helper]
Jan 08 18:33:15 ubuntu-vm kernel: drm_atomic_commit+0xb8/0x118 [drm]
Jan 08 18:33:15 ubuntu-vm kernel: drm_atomic_helper_update_plane+0x16c/0x1b0 [drm_kms_helper]
Jan 08 18:33:15 ubuntu-vm kernel: __setplane_atomic+0x100/0x178 [drm]
Jan 08 18:33:15 ubuntu-vm kernel: drm_mode_cursor_universal+0x124/0x288 [drm]
Jan 08 18:33:15 ubuntu-vm kernel: drm_mode_cursor_common+0x150/0x270 [drm]
Jan 08 18:33:15 ubuntu-vm kernel: drm_mode_cursor2_ioctl+0x1c/0x48 [drm]
Jan 08 18:33:15 ubuntu-vm kernel: drm_ioctl_kernel+0xf4/0x1c0 [drm]
Jan 08 18:33:15 ubuntu-vm kernel: drm_ioctl+0x290/0x6d0 [drm]
Jan 08 18:33:15 ubuntu-vm kernel: __arm64_sys_ioctl+0xd0/0x138
Jan 08 18:33:15 ubuntu-vm kernel: invoke_syscall+0x7c/0x128
Jan 08 18:33:15 ubuntu-vm kernel: el0_svc_common.constprop.0+0x4c/0x140
Jan 08 18:33:15 ubuntu-vm kernel: do_el0_svc+0x28/0x58
Jan 08 18:33:15 ubuntu-vm kernel: el0_svc+0x40/0x108
Jan 08 18:33:15 ubuntu-vm kernel: el0t_64_sync_handler+0x148/0x158
Jan 08 18:33:15 ubuntu-vm kernel: el0t_64_sync+0x1b0/0x1b8
Jan 08 18:33:15 ubuntu-vm kernel: INFO: task JS Helper:2914 blocked for more than 241 seconds.
Jan 08 18:33:15 ubuntu-vm kernel: Not tainted 6.6.4-060604-generic #202312030734
Jan 08 18:33:15 ubuntu-vm kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 08 18:33:15 ubuntu-vm kernel: task:JS Helper state:D stack:0 pid:2914 ppid:2500 flags:0x0000000d
Jan 08 18:33:15 ubuntu-vm kernel: Call trace:
Jan 08 18:33:15 ubuntu-vm kernel: __switch_to+0xc0/0x108
Jan 08 18:33:15 ubuntu-vm kernel: __schedule+0x33c/0x838
Jan 08 18:33:15 ubuntu-vm kernel: schedule+0x68/0x168
Jan 08 18:33:15 ubuntu-vm kernel: virtio_gpu_queue_ctrl_sgs+0x118/0x358 [virtio_gpu]
Jan 08 18:33:15 ubuntu-vm kernel: virtio_gpu_queue_fenced_ctrl_buffer+0x208/0x270 [virtio_gpu]
Jan 08 18:33:15 ubuntu-vm kernel: virtio_gpu_cmd_context_detach_resource+0x88/0xc8 [virtio_gpu]
Jan 08 18:33:15 ubuntu-vm kernel: virtio_gpu_gem_object_close+0x98/0xf0 [virtio_gpu]
Jan 08 18:33:15 ubuntu-vm kernel: drm_gem_object_release_handle+0x44/0x98 [drm]
Jan 08 18:33:15 ubuntu-vm kernel: idr_for_each+0x74/0x118
Jan 08 18:33:15 ubuntu-vm kernel: drm_gem_release+0x34/0x70 [drm]
Jan 08 18:33:15 ubuntu-vm kernel: drm_file_free+0x178/0x2a0 [drm]
Jan 08 18:33:15 ubuntu-vm kernel: drm_release+0xc4/0x1a8 [drm]
Jan 08 18:33:15 ubuntu-vm kernel: __fput+0xd8/0x2c0
Jan 08 18:33:15 ubuntu-vm kernel: ____fput+0x1c/0x40
Jan 08 18:33:15 ubuntu-vm kernel: task_work_run+0x80/0x100
Jan 08 18:33:15 ubuntu-vm kernel: do_exit+0x484/0x5d8
Jan 08 18:33:15 ubuntu-vm kernel: do_group_exit+0x40/0xa8
Jan 08 18:33:15 ubuntu-vm kernel: get_signal+0x868/0x8b8
Jan 08 18:33:15 ubuntu-vm kernel: do_signal+0xa4/0x238
Jan 08 18:33:15 ubuntu-vm kernel: do_notify_resume+0x12c/0x2c0
Jan 08 18:33:15 ubuntu-vm kernel: el0_svc+0xc8/0x108
Jan 08 18:33:15 ubuntu-vm kernel: el0t_64_sync_handler+0x148/0x158
Jan 08 18:33:15 ubuntu-vm kernel: el0t_64_sync+0x1b0/0x1b8
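To see at a glance which tasks the kernel reported as hung, the "blocked for more than" lines can be pulled out of a log like the one above. This is a rough sketch rather than anything from the original report: the function name `hung_tasks` is hypothetical, and it assumes the message format shown in this log.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print one line per
# hung-task report as "<pid> <task name> <seconds>".
# Assumption: messages look like
#   INFO: task NAME:PID blocked for more than N seconds.
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds.*/\2 \1 \3/p'
}

# Example use, left commented out so sourcing this file has no side effects:
#   journalctl -b -k | hung_tasks
```

For the log above, this would list Xorg (pid 2569) and JS Helper (pid 2914). Both traces wait in virtio_gpu_queue_ctrl_sgs.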
This happens to me too, not just with virtio-gpu but also with virtio-net, and only with Ubuntu 22.04. In particular, it doesn't seem to happen with Ubuntu 20.04, Fedora, or anything else.
I want to check this on upstream QEMU, but there's a bug with virtio graphics in the current version: https://gitlab.com/qemu-project/qemu/-/issues/2050
I don't have everything set up to build QEMU master, so I can't check whether it's an upstream bug until the next QEMU release.
Update: this also happens on 23.04. I wonder if it's a kernel version thing.
Currently I have Linux version 6.6.4-060604-generic.
Closing: I'm no longer using UTM+QEMU on a Mac and probably won't for a while, so I won't be able to check whether I can reproduce this.