UTM icon indicating copy to clipboard operation
UTM copied to clipboard

Ubuntu jammy 22.04 vm random freezes on Mac M1 host

Open dhammika opened this issue 2 years ago • 9 comments

Describe the issue
Ubuntu jammy 22.04 random freezes.

Configuration

  • UTM Version: 4.4.4
  • macOS Version: Sanoma 14
  • Mac Chip: M1

VM was working fine with default setting for a while but now experience random freezes. I've tried disabling gpu emulation but seems like it's happening on both virtio-ramfb, virtio-ramfb-gl cards. I haven't tried other display driver settings. On the guest I see following hangup logs,

Nov 11 13:47:33 lux gnome-shell[2615]: libinput error: event3  - QEMU QEMU USB Keyboard: client bug: event processing lagging behind by 22ms, your system is too slow
Nov 11 13:48:47 lux gnome-shell[2615]: waiting got error - 16, slow gpu or hang?
Nov 11 13:50:07 lux xdg-desktop-por[2881]: Failed to get application states: GDBus.Error:org.freedesktop.portal.Error.Failed: Could not get window list
Nov 11 13:50:54 lux gnome-shell[2615]: libinput error: event3  - QEMU QEMU USB Keyboard: client bug: event processing lagging behind by 21ms, your system is too slow
Nov 11 13:52:06 lux kernel: INFO: task gnome-shell:2615 blocked for more than 120 seconds.
Nov 11 13:52:06 lux kernel:       Not tainted 5.15.0-88-generic #98-Ubuntu
Nov 11 13:52:06 lux kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 11 13:52:06 lux kernel: task:gnome-shell     state:D stack:    0 pid: 2615 ppid:  2487 flags:0x00000004
Nov 11 13:52:06 lux kernel: Call trace:
Nov 11 13:52:06 lux kernel:  __switch_to+0xf8/0x150
Nov 11 13:52:06 lux kernel:  __schedule+0x2b8/0x790
Nov 11 13:52:06 lux kernel:  schedule+0x64/0x140
Nov 11 13:52:06 lux kernel:  virtio_gpu_queue_ctrl_sgs+0x13c/0x364 [virtio_gpu]
Nov 11 13:52:06 lux kernel:  virtio_gpu_queue_fenced_ctrl_buffer+0x108/0x21c [virtio_gpu]
Nov 11 13:52:06 lux kernel:  virtio_gpu_cmd_resource_create_3d+0xe0/0x100 [virtio_gpu]
Nov 11 13:52:06 lux kernel:  virtio_gpu_object_create+0x124/0x2ec [virtio_gpu]
Nov 11 13:52:06 lux kernel:  virtio_gpu_resource_create_ioctl+0xec/0x290 [virtio_gpu]
Nov 11 13:52:06 lux kernel:  drm_ioctl_kernel+0xcc/0x120 [drm]
Nov 11 13:52:06 lux kernel:  drm_ioctl+0x2c4/0x690 [drm]
Nov 11 13:52:06 lux kernel:  __arm64_sys_ioctl+0xb0/0xf4
Nov 11 13:52:06 lux kernel:  invoke_syscall+0x78/0x100
Nov 11 13:52:06 lux kernel:  el0_svc_common.constprop.0+0x54/0x184
Nov 11 13:52:06 lux kernel:  do_el0_svc+0x30/0xac
Nov 11 13:52:06 lux kernel:  el0_svc+0x48/0x160
Nov 11 13:52:06 lux kernel:  el0t_64_sync_handler+0xa4/0x130
Nov 11 13:52:06 lux kernel:  el0t_64_sync+0x1a4/0x1a8
Nov 11 13:52:06 lux kernel: INFO: task Xwayland:2823 blocked for more than 120 seconds.
Nov 11 13:52:06 lux kernel:       Not tainted 5.15.0-88-generic #98-Ubuntu
Nov 11 13:52:06 lux kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 11 13:52:06 lux kernel: task:Xwayland        state:D stack:    0 pid: 2823 ppid:  2615 flags:0x00000005
Nov 11 13:52:06 lux kernel: Call trace:
Nov 11 13:52:06 lux kernel:  __switch_to+0xf8/0x150
Nov 11 13:52:06 lux kernel:  __schedule+0x2b8/0x790
Nov 11 13:52:06 lux kernel:  schedule+0x64/0x140
Nov 11 13:52:06 lux kernel:  virtio_gpu_queue_ctrl_sgs+0x13c/0x364 [virtio_gpu]
Nov 11 13:52:06 lux kernel:  virtio_gpu_queue_fenced_ctrl_buffer+0x108/0x21c [virtio_gpu]
Nov 11 13:52:06 lux kernel:  virtio_gpu_cmd_resource_create_3d+0xe0/0x100 [virtio_gpu]
Nov 11 13:52:06 lux kernel:  virtio_gpu_object_create+0x124/0x2ec [virtio_gpu]
Nov 11 13:52:06 lux kernel:  virtio_gpu_resource_create_ioctl+0xec/0x290 [virtio_gpu]
Nov 11 13:52:06 lux kernel:  drm_ioctl_kernel+0xcc/0x120 [drm]
Nov 11 13:52:06 lux kernel:  drm_ioctl+0x2c4/0x690 [drm]
Nov 11 13:52:06 lux kernel:  __arm64_sys_ioctl+0xb0/0xf4
Nov 11 13:52:06 lux kernel:  invoke_syscall+0x78/0x100
Nov 11 13:52:06 lux kernel:  el0_svc_common.constprop.0+0x54/0x184
Nov 11 13:52:06 lux kernel:  do_el0_svc+0x30/0xac
Nov 11 13:52:06 lux kernel:  el0_svc+0x48/0x160
Nov 11 13:52:06 lux kernel:  el0t_64_sync_handler+0xa4/0x130
Nov 11 13:52:06 lux kernel:  el0t_64_sync+0x1a4/0x1a8

Also tried upgrading UTM and Mac OS but that didn't help.

dhammika avatar Nov 12 '23 05:11 dhammika

I wonder if it's worth taking this to the linux kernel mailing list, since you have some decent logs for it ?

EDIT: possibly worth starting on one of the virtio mailing lists ?

stuaxo avatar Nov 14 '23 20:11 stuaxo

I was using the METAL output, when I switched it seems to happen less.

I tried various outputs, the VGA one is almost instant, but the others all seem to stop from time to time as well (this on M2 macbook).

stuaxo avatar Nov 21 '23 15:11 stuaxo

Attached are the debug logs for 3 instances of the freeze on an Ubuntu 22.04 VM. Note that this is not related to #5749, which crashes the VM outright; this one, the VM is still responsive via SSH and console, just not via the UI.

cky avatar Nov 22 '23 22:11 cky

Another trace from 5.15.0-89 LTS

 956 Nov 27 15:56:22 lux kernel: INFO: task gnome-shell:2701 blocked for more than 120 seconds.
 957 Nov 27 15:56:22 lux kernel:       Not tainted 5.15.0-89-generic #99-Ubuntu
 958 Nov 27 15:56:22 lux kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 959 Nov 27 15:56:22 lux kernel: task:gnome-shell     state:D stack:    0 pid: 2701 ppid:  2579 flags:0x00000004
 960 Nov 27 15:56:22 lux kernel: Call trace:
 961 Nov 27 15:56:22 lux kernel:  __switch_to+0xf8/0x150
 962 Nov 27 15:56:22 lux kernel:  __schedule+0x2b8/0x790
 963 Nov 27 15:56:22 lux kernel:  schedule+0x64/0x140
 964 Nov 27 15:56:22 lux kernel:  virtio_gpu_queue_ctrl_sgs+0x13c/0x364 [virtio_gpu]
 965 Nov 27 15:56:22 lux kernel:  virtio_gpu_queue_fenced_ctrl_buffer+0x108/0x21c [virtio_gpu]
 966 Nov 27 15:56:22 lux kernel:  virtio_gpu_cmd_create_resource+0x9c/0xc0 [virtio_gpu]
 967 Nov 27 15:56:22 lux kernel:  virtio_gpu_object_create+0x22c/0x2ec [virtio_gpu]
 968 Nov 27 15:56:22 lux kernel:  virtio_gpu_mode_dumb_create+0xc4/0x1f0 [virtio_gpu]
 969 Nov 27 15:56:22 lux kernel:  drm_mode_create_dumb+0x98/0xb4 [drm]
 970 Nov 27 15:56:22 lux kernel:  drm_mode_create_dumb_ioctl+0x18/0x2c [drm]
 971 Nov 27 15:56:22 lux kernel:  drm_ioctl_kernel+0xcc/0x120 [drm]
 972 Nov 27 15:56:22 lux kernel:  drm_ioctl+0x2c4/0x690 [drm]
 973 Nov 27 15:56:22 lux kernel:  __arm64_sys_ioctl+0xb0/0xf4
 974 Nov 27 15:56:22 lux kernel:  invoke_syscall+0x78/0x100
 975 Nov 27 15:56:22 lux kernel:  el0_svc_common.constprop.0+0x54/0x184
 976 Nov 27 15:56:22 lux kernel:  do_el0_svc+0x30/0xac
 977 Nov 27 15:56:22 lux kernel:  el0_svc+0x48/0x160
 978 Nov 27 15:56:22 lux kernel:  el0t_64_sync_handler+0xa4/0x130
 979 Nov 27 15:56:22 lux kernel:  el0t_64_sync+0x1a4/0x1a8
 980 Nov 27 15:56:22 lux kernel: INFO: task kworker/u12:1:35798 blocked for more than 120 seconds.
 981 Nov 27 15:56:22 lux kernel:       Not tainted 5.15.0-89-generic #99-Ubuntu
 982 Nov 27 15:56:22 lux kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 983 Nov 27 15:56:22 lux kernel: task:kworker/u12:1   state:D stack:    0 pid:35798 ppid:     2 flags:0x00000008
 984 Nov 27 15:56:22 lux kernel: Workqueue: events_unbound commit_work [drm_kms_helper]
 985 Nov 27 15:56:22 lux kernel: Call trace:
 986 Nov 27 15:56:22 lux kernel:  __switch_to+0xf8/0x150
 987 Nov 27 15:56:22 lux kernel:  __schedule+0x2b8/0x790
 988 Nov 27 15:56:22 lux kernel:  schedule+0x64/0x140
 989 Nov 27 15:56:22 lux kernel:  virtio_gpu_queue_ctrl_sgs+0x13c/0x364 [virtio_gpu]
 990 Nov 27 15:56:22 lux kernel:  virtio_gpu_queue_fenced_ctrl_buffer+0x108/0x21c [virtio_gpu]
 991 Nov 27 15:56:22 lux kernel:  virtio_gpu_cmd_set_scanout+0x8c/0xb0 [virtio_gpu]
 992 Nov 27 15:56:22 lux kernel:  virtio_gpu_primary_plane_update+0x1ac/0x444 [virtio_gpu]
 993 Nov 27 15:56:22 lux kernel:  drm_atomic_helper_commit_planes+0x118/0x2e0 [drm_kms_helper]
 994 Nov 27 15:56:22 lux kernel:  drm_atomic_helper_commit_tail+0x58/0xa4 [drm_kms_helper]
 995 Nov 27 15:56:22 lux kernel:  commit_tail+0x1c4/0x220 [drm_kms_helper]
 996 Nov 27 15:56:22 lux kernel:  commit_work+0x1c/0x30 [drm_kms_helper]
 997 Nov 27 15:56:22 lux kernel:  process_one_work+0x210/0x4f0
 998 Nov 27 15:56:22 lux kernel:  worker_thread+0x170/0x5a0
 999 Nov 27 15:56:22 lux kernel:  kthread+0x110/0x114
1000 Nov 27 15:56:22 lux kernel:  ret_from_fork+0x10/0x20

Is there a known jammy LTS build that works? I disable all virtio gpu stuff and set the trackpad speed to slowest level but still see a freeze at least once daily.

dhammika avatar Nov 29 '23 18:11 dhammika

I think we really need to reproduce this in plain qemu and take it to the bugtracker there to have a better chance of getting this resolved.

stuaxo avatar Nov 30 '23 13:11 stuaxo

@dhammika how do you get this more detailed trace? I either don't have the same issue or I'm not using the right settings to get all the info.

stuaxo avatar Dec 06 '23 11:12 stuaxo

@dhammika how do you get this more detailed trace? I either don't have the same issue or I'm not using the right settings to get all the info.

Apologies missed this msg, That was from journalctl -b -1 -e I also tried UTM upgrade with 4.4.5 build but still hit this issue. Otherwise it's working great, almost worky :)

dhammika avatar Jan 14 '24 22:01 dhammika

Thanks, I found that in the end - it's good you documented here. At the moment, I'm trying to work out if I should upgrade to 23.10, I should probably verify it works with a separate small distro.

The latest thing I've noticed is that Gtk apps, seem to have short freezes from time to time.

stuaxo avatar Jan 16 '24 00:01 stuaxo

Great! I have to stay in LTS builds, so 23.10 isn't an option. Guest VM is fine, it's just a UX freeze. Not sure though if it's MacOS virtualization/qemu issue. It was working fine for a while before Sanoma OS upgrade. I don't have much insights there. UTM dbg log shows it's all ok on the VM host.

dhammika avatar Jan 17 '24 17:01 dhammika