KVM-Opencore

macOS kernel panic during boot due to non-monotonic TSC

Open sej7278 opened this issue 4 years ago • 50 comments

Hi, this is possibly not even an OpenCore issue, but does anything jump out at you as being an obvious problem?

Got a VM running 11.6.1 with your OC 0.7.4 install on my Xeon E5-2650 v2, but it simply will not take the Monterey update (one of the betas I tried and a fresh install failed too). It starts the upgrade, but after a couple of reboots it seems to panic and gets stuck at mach reboot.

It gets past the time-out here:

1_timeout

But then panics here:

2_panic

And finally hangs completely here trying to reboot i guess:

3_mach_reboot

My config.plist is basically the same as yours, but with some SMBIOS stuff added by opencore-configurator (serials etc.). oc-validator had nothing bad to say about it.

libvirt xml is: libvirt.txt

Host is Debian Sid, qemu 6.1.0, libvirt 7.6.0, kernel 5.14.12

sej7278 avatar Oct 26 '21 18:10 sej7278

High IO on the host during the upgrade might be triggering a kernel panic due to timeouts in the guest. Try adding the parameters I identified here to your boot-args:

https://www.nicksherlock.com/2020/08/solving-macos-vm-kernel-panics-on-heavily-loaded-proxmox-qemu-kvm-servers/

tlbto_us=0 vti=9

thenickdude avatar Oct 26 '21 21:10 thenickdude

ok thanks, will try in the morning and report back.

sej7278 avatar Oct 26 '21 22:10 sej7278

ok tried that and thought it was going to make a difference as cpu usage went through the roof (like 100% on half my 16 cores!) but it still ended up at the mach reboot crash.

it seems to reboot a lot at this disk crypto stage (not crash, just reboot back to opencore chooser):

disk

i might try reducing the vCPUs to 4, maybe it's a thread timeout/race or something....?

sej7278 avatar Oct 27 '21 09:10 sej7278

That's curious. If CPU usage increased, it seems like a kernel thread is spinning in an infinite loop. With tlbto_us=0, a core that fails to respond to a TLB flush in a timely fashion is ignored completely instead of triggering a panic.

I'll check out your VM config

thenickdude avatar Oct 28 '21 09:10 thenickdude

Where did you get your OVMF image by the way? You might try switching to one provided by your distro just in case

thenickdude avatar Oct 28 '21 09:10 thenickdude

I think the ovmf was from osx-kvm, I'll try the Debian one, might also try a fresh install again instead of an upgrade or maybe try without gpu passthrough.

Reducing the core count made no difference nor did switching to virtio-net from vmxnet3 (didn't realize that worked).

It seems to stall at various points for a few minutes then reboot, but once it gets to Mach reboot it's definitely dead.

sej7278 avatar Oct 28 '21 09:10 sej7278

I've never observed behaviour like that so I'm a bit in the dark on what might cause it, sorry!

If you've got any passthrough devices defined, does it boot if they're removed?

thenickdude avatar Oct 28 '21 09:10 thenickdude

OVMF_CODE.fd or OVMF_CODE_4M.fd from debian seem to reduce cpu usage to almost nothing and also reduced the reboots, but it still ends up at MACH Reboot. i'll try removing the gpu next, as that seemed to work for someone on reddit

sej7278 avatar Oct 28 '21 10:10 sej7278

I definitely recommend removing passthrough GPUs during upgrades because the repeated restarts ask a lot of the shitty AMD GPU drivers.

thenickdude avatar Oct 28 '21 10:10 thenickdude

i'm going to close this and give up, as a completely fresh install with fresh oc15 and ovmf and no passthrough doesn't even get as far as disk utility, so i'm assuming monterey is a lot more fussy about hardware or software (as bigsur runs fine on the same vm).

thanks for your time.

sej7278 avatar Oct 28 '21 11:10 sej7278

The only real hardware the guest can even see is your CPU, which is perfectly compatible (same generation as mine)

thenickdude avatar Oct 28 '21 12:10 thenickdude

Oh, your CPU argument is missing +hypervisor. The macOS kernel gives you all sorts of timing slack if it knows it's running in a VM, and it only knows that when +hypervisor is set.

thenickdude avatar Oct 28 '21 12:10 thenickdude

i tried adding that to my existing flags and i tried changing completely to:

-cpu host,kvm=on,vendor=GenuineIntel,+kvm_pv_unhalt,+kvm_pv_eoi,+hypervisor,+invtsc

but it didn't make any difference. this is really confusing as i've never really had a problem before (other than bigsur, which just needed a new opencore), but monterey seems insurmountable to me - i can't even get to the installer, let alone an upgrade!

sej7278 avatar Oct 28 '21 12:10 sej7278

@thenickdude do you have a monterey vm running with this EFI?

kuasha420 avatar Oct 28 '21 20:10 kuasha420

Absolutely, I installed using a recovery, a full installer, and upgraded from Big Sur. Didn't have any problems with any of those scenarios.

Passthrough of RX580 successful.

QEMU 6.0.0-4, edk2-stable202108, pc-q35-6.0

thenickdude avatar Oct 28 '21 21:10 thenickdude

Here's my QEMU commandline for my VM with passthrough:

/usr/bin/kvm \
  -no-shutdown \
  -smbios 'type=1,uuid=...' \
  -drive 'if=pflash,unit=0,format=raw,readonly=on,file=/usr/share/pve-edk2-firmware//OVMF_CODE.fd' \
  -drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,size=131072,file=/dev/zvol/rpool/vms/vm-110-disk-1' \
  -smp '16,sockets=1,cores=16,maxcpus=16' \
  -nodefaults \
  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
  -vga none \
  -nographic \
  -cpu 'Penryn,enforce,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt,vendor=GenuineIntel' \
  -m 16384 \
  -object 'memory-backend-file,id=ram-node0,size=16384M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
  -numa 'node,nodeid=0,cpus=0-15,memdev=ram-node0' \
  -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg \
  -device 'vfio-pci,host=0000:03:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,multifunction=on' \
  -device 'vfio-pci,host=0000:03:00.1,id=hostpci0.1,bus=ich9-pcie-port-1,addr=0x0.1' \
  -device 'vfio-pci,host=0000:00:1a.0,id=hostpci1,bus=ich9-pcie-port-2,addr=0x0' \
  -device 'vfio-pci,host=0000:00:1d.0,id=hostpci2,bus=ich9-pcie-port-3,addr=0x0' \
  -drive 'file=/dev/zvol/rpool/vms/vm-111-disk-0,if=none,id=drive-virtio0,cache=unsafe,discard=on,format=raw,aio=io_uring,detect-zeroes=unmap' \
  -device 'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=100' \
  -netdev 'type=tap,id=net0,ifname=tap110i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
  -device 'virtio-net-pci,mac=...,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' \
  -machine 'type=q35+pve0' \
  -device 'isa-applesmc,osk=...' \
  -smbios 'type=2' \
  -cpu 'host,kvm=on,vendor=GenuineIntel,+kvm_pv_unhalt,+kvm_pv_eoi,+hypervisor,+invtsc'

(Duplicate args are due to Proxmox config restrictions)
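With duplicated -cpu arguments like these, it helps to check which one actually takes effect. A minimal Python sketch, assuming QEMU honours the last -cpu option it is given (which is what the duplicated arguments above rely on); effective_cpu_flags is a hypothetical helper, not part of QEMU or Proxmox:

```python
import shlex


def effective_cpu_flags(cmdline):
    """Return the comma-separated fields of the last -cpu argument, or [] if none."""
    args = shlex.split(cmdline)
    # Collect the value that follows every "-cpu" option, in order of appearance.
    cpu_values = [args[i + 1] for i, arg in enumerate(args[:-1]) if arg == "-cpu"]
    return cpu_values[-1].split(",") if cpu_values else []


# Shortened version of the command line above: two -cpu options, the second one wins.
example = (
    "/usr/bin/kvm "
    "-cpu 'Penryn,enforce,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt,vendor=GenuineIntel' "
    "-m 16384 "
    "-cpu 'host,kvm=on,vendor=GenuineIntel,+kvm_pv_unhalt,+kvm_pv_eoi,+hypervisor,+invtsc'"
)

flags = effective_cpu_flags(example)
print("CPU model:", flags[0])
print("+hypervisor present:", "+hypervisor" in flags)
```

The same helper can be pointed at the output of ps for a running VM to confirm that +hypervisor made it into the final -cpu argument.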

/usr/share/qemu-server/pve-q35-4.0.cfg is:

[device "ehci"]
  driver = "ich9-usb-ehci1"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1d.7"

[device "uhci-1"]
  driver = "ich9-usb-uhci1"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1d.0"
  masterbus = "ehci.0"
  firstport = "0"

[device "uhci-2"]
  driver = "ich9-usb-uhci2"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1d.1"
  masterbus = "ehci.0"
  firstport = "2"

[device "uhci-3"]
  driver = "ich9-usb-uhci3"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1d.2"
  masterbus = "ehci.0"
  firstport = "4"

[device "ehci-2"]
  driver = "ich9-usb-ehci2"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1a.7"

[device "uhci-4"]
  driver = "ich9-usb-uhci4"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1a.0"
  masterbus = "ehci-2.0"
  firstport = "0"

[device "uhci-5"]
  driver = "ich9-usb-uhci5"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1a.1"
  masterbus = "ehci-2.0"
  firstport = "2"

[device "uhci-6"]
  driver = "ich9-usb-uhci6"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1a.2"
  masterbus = "ehci-2.0"
  firstport = "4"

[device "audio0"]
  driver = "ich9-intel-hda"
  bus = "pcie.0"
  addr = "1b.0"

[device "ich9-pcie-port-1"]
  driver = "pcie-root-port"
  x-speed = "16"
  x-width = "32"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1c.0"
  port = "1"
  chassis = "1"

[device "ich9-pcie-port-2"]
  driver = "pcie-root-port"
  x-speed = "16"
  x-width = "32"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1c.1"
  port = "2"
  chassis = "2"

[device "ich9-pcie-port-3"]
  driver = "pcie-root-port"
  x-speed = "16"
  x-width = "32"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1c.2"
  port = "3"
  chassis = "3"

[device "ich9-pcie-port-4"]
  driver = "pcie-root-port"
  x-speed = "16"
  x-width = "32"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1c.3"
  port = "4"
  chassis = "4"

[device "pcidmi"]
  driver = "i82801b11-bridge"
  bus = "pcie.0"
  addr = "1e.0"

[device "pci.0"]
  driver = "pci-bridge"
  bus = "pcidmi"
  addr = "1.0"
  chassis_nr = "1"

[device "pci.1"]
  driver = "pci-bridge"
  bus = "pcidmi"
  addr = "2.0"
  chassis_nr = "2"

[device "pci.2"]
  driver = "pci-bridge"
  bus = "pcidmi"
  addr = "3.0"
  chassis_nr = "3"

[device "pci.3"]
  driver = "pci-bridge"
  bus = "pcidmi"
  addr = "4.0"
  chassis_nr = "4"

thenickdude avatar Oct 28 '21 21:10 thenickdude

@thenickdude I just did the upgrade from 11.6.1 to 12.0.1 and everything went great! Here's my configuration if anyone is interested.

LibVirt Config

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>macOS</name>
  <uuid>2aca0dd6-cec9-4717-9ab2-0b7b13d111c3</uuid>
  <title>macOS</title>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <vcpu placement='static'>8</vcpu>
  <vcpus>
    <vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='1' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='2' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='3' enabled='yes' hotpluggable='yes' order='4'/>
    <vcpu id='4' enabled='yes' hotpluggable='yes' order='5'/>
    <vcpu id='5' enabled='yes' hotpluggable='yes' order='6'/>
    <vcpu id='6' enabled='yes' hotpluggable='yes' order='7'/>
    <vcpu id='7' enabled='yes' hotpluggable='yes' order='8'/>
  </vcpus>
  <cputune>
    <vcpupin vcpu='0' cpuset='2'/>
    <vcpupin vcpu='1' cpuset='8'/>
    <vcpupin vcpu='2' cpuset='3'/>
    <vcpupin vcpu='3' cpuset='9'/>
    <vcpupin vcpu='4' cpuset='4'/>
    <vcpupin vcpu='5' cpuset='10'/>
    <vcpupin vcpu='6' cpuset='5'/>
    <vcpupin vcpu='7' cpuset='11'/>
    <emulatorpin cpuset='0-1,6-7'/>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc-q35-6.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/edk2-ovmf/x64/OVMF_CODE.fd</loader>
    <nvram>/home/kuasha/OSX-KVM/OVMF_VARS-1024x768.fd</nvram>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough' check='none' migratable='on'>
    <topology sockets='1' dies='1' cores='4' threads='2'/>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/home/kuasha/macOS/OpenCore-v15.img'/>
      <target dev='sda' bus='sata'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x8'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x9'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0xa'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0xb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x3'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0xc'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x4'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0xd'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x5'/>
    </controller>
    <controller type='pci' index='7' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='7' port='0xe'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x6'/>
    </controller>
    <controller type='pci' index='8' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='8' port='0xf'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='qemu-xhci'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:8e:e2:66'/>
      <source bridge='br0'/>
      <target dev='tap0'/>
      <model type='vmxnet3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </interface>
    <input type='mouse' bus='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0e' function='0x0'/>
    </input>
    <input type='keyboard' bus='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0f' function='0x0'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <sound model='ich9'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </sound>
    <audio id='1' type='none'/>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='no'>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='no'>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
    </hostdev>
    <memballoon model='none'/>
  </devices>
  <qemu:commandline>
    <qemu:arg value='-device'/>
    <qemu:arg value='isa-applesmc,osk=ourhardworkbythesewordsguardedpleasedontsteal(c)AppleComputerInc'/>
    <qemu:arg value='-cpu'/>
    <qemu:arg value='host,vendor=GenuineIntel,+hypervisor,+invtsc,kvm=on,+fma,+avx,+avx2,+aes,+ssse3,+sse4_2,+popcnt,+sse4a,+bmi1,+bmi2'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='hda-micro,audiodev=hda'/>
    <qemu:arg value='-audiodev'/>
    <qemu:arg value='pa,id=hda,server=unix:/run/user/1000/pulse/native'/>
    <qemu:arg value='-object'/>
    <qemu:arg value='input-linux,id=mouse1,evdev=/dev/input/by-id/usb-Logitech_Gaming_Mouse_G502_1263366D3336-event-mouse'/>
    <qemu:arg value='-object'/>
    <qemu:arg value='input-linux,id=kbd1,evdev=/dev/input/by-id/ckb-Corsair_STRAFE_RGB_Gaming_Keyboard_vKB_-event,grab_all=on,repeat=on'/>
  </qemu:commandline>
</domain>

System Information

                     ./o.                  kuasha@kuasha-z490ud 
                   ./sssso-                -------------------- 
                 `:osssssss+-              OS: EndeavourOS Linux x86_64 
               `:+sssssssssso/.            Host: Z490 UD 
             `-/ossssssssssssso/.          Kernel: 5.14.14-arch1-1 
           `-/+sssssssssssssssso+:`        Uptime: 11 hours, 9 mins 
         `-:/+sssssssssssssssssso+/.       Packages: 1208 (pacman) 
       `.://osssssssssssssssssssso++-      Shell: zsh 5.8 
      .://+ssssssssssssssssssssssso++:     Resolution: 1720x1440 
    .:///ossssssssssssssssssssssssso++:    DE: Plasma 5.23.1 
  `:////ssssssssssssssssssssssssssso+++.   WM: KWin 
`-////+ssssssssssssssssssssssssssso++++-   Theme: Breeze Dark [Plasma], Breeze [GTK2] 
 `..-+oosssssssssssssssssssssssso+++++/`   Icons: [Plasma], breeze-dark [GTK2/3] 
   ./++++++++++++++++++++++++++++++/:.     Terminal: konsole 
  `:::::::::::::::::::::::::------``       Terminal Font: MesloLGS NF 20 
                                           CPU: Intel i5-10600 (12) @ 4.800GHz 
                                           GPU: Intel CometLake-S GT2 [UHD Graphics 630] 
                                           GPU: AMD ATI Radeon RX 470/480/570/570X/580/580X/590 
                                           Memory: 14148MiB / 31952MiB 

Screen Shot 2021-10-29 at 3 42 13 PM

kuasha420 avatar Oct 29 '21 09:10 kuasha420

@kuasha420 could you list the libvirt/qemu versions you're using? i'm wondering if it's a qemu 6.1 issue, as @thenickdude is using 6.0 and it looks like you are too

sej7278 avatar Oct 29 '21 10:10 sej7278

@sej7278

pacman -Q libvirt qemu edk2-ovmf linux
libvirt 1:7.8.0-1
qemu 6.1.0-5
edk2-ovmf 202108-1
linux 5.14.14.arch1-1

I am using qemu 6.1 as well but the machine type should be 6.0.

machine type 6.1 has issues.

kuasha420 avatar Oct 29 '21 10:10 kuasha420

just compiled QEMU emulator version 6.1.50 (v6.1.0-1735-gc52d69e7) and that barely even starts macos. changing the machine type doesn't seem to make any difference for me, and i also tried your commandline. i'm lost, i wonder if it's because i'm using virtio-blk instead of sata

sej7278 avatar Oct 29 '21 11:10 sej7278

@sej7278 I've also experienced the maxed-out 100% CPU usage. It seems to be related to macOS trying to do a full APFS fsck or some similar check after a busted shutdown. I think it's just Monterey.

sickcodes avatar Oct 29 '21 17:10 sickcodes

@sickcodes yes it's definitely doing that but I think I'm making it past that stage

sej7278 avatar Oct 29 '21 18:10 sej7278

i think i finally managed to get a screenshot before it crashes, if this makes any sense:

Screenshot from 2021-10-30 00-41-40

sej7278 avatar Oct 29 '21 23:10 sej7278

So that seems to be panicking due to non-monotonic time (the clock going backwards).

I wonder if you're getting a warning at VM launch time that "invtsc" isn't actually available on your system. Try removing that from your CPU args if it's currently there.

Can you post the VM command/config you're currently using and also the output of this on the host:

cat /proc/cpuinfo

(You only need to paste the output from a single one of the cores)
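The part of that output that matters here is the flags line. A minimal Python sketch for pulling out the TSC-related flags, assuming that Linux reports an invariant TSC as the constant_tsc and nonstop_tsc flags (that mapping is my understanding, not something confirmed in this thread); tsc_flags is a hypothetical helper:

```python
from pathlib import Path

# Flags of interest; the selection is an assumption made for this sketch.
WANTED = {"tsc", "constant_tsc", "nonstop_tsc", "rdtscp", "tsc_deadline_timer"}


def tsc_flags(cpuinfo_text):
    """Return the TSC-related flags found on the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Match "flags" exactly so that the separate "vmx flags" line is skipped.
        if key.strip() == "flags":
            return sorted(WANTED & set(value.split()))
    return []


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        found = tsc_flags(cpuinfo.read_text())
        print("TSC-related flags:", " ".join(found) or "(none found)")
        print("looks invariant:", {"constant_tsc", "nonstop_tsc"} <= set(found))
```

Note that these flags only describe what the CPU advertises; the kernel can still mark the TSC unstable at runtime, which would show up in dmesg rather than here.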

thenickdude avatar Oct 30 '21 02:10 thenickdude

This user has the same panic on bare-metal:

https://www.reddit.com/r/hackintosh/comments/qhjnly/random_kernel_panics_on_x79/

OpenCore bug tracker:

https://github.com/acidanthera/bugtracker/issues/1676

Although in the case of QEMU I think it's QEMU's job to present a consistent timestamp counter, so in theory TSCAdjustReset shouldn't be needed...

thenickdude avatar Oct 30 '21 03:10 thenickdude

Also, your host isn't going to sleep during the install because you aren't moving the mouse to keep it awake, is it?

thenickdude avatar Oct 30 '21 04:10 thenickdude

ah i had a problem with kvm-pit with catalina hard crashing the host, the fix was to remove this lot, but i don't have that in my monterey config, i wonder what the defaults are:

<clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
</clock>

cpuinfo:

processor	: 31
vendor_id	: GenuineIntel
cpu family	: 6
model		: 62
model name	: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
stepping	: 4
microcode	: 0x42e
cpu MHz		: 1200.000
cache size	: 20480 KB
physical id	: 1
siblings	: 16
core id		: 7
cpu cores	: 8
apicid		: 47
initial apicid	: 47
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts md_clear flush_l1d
vmx flags	: vnmi preemption_timer posted_intr invvpid ept_x_only ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid unrestricted_guest vapic_reg vid ple
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips	: 5190.81
clflush size	: 64
cache_alignment	: 64
address sizes	: 46 bits physical, 48 bits virtual
power management:

host is a desktop so not going to sleep.

when i got that screenshot i was running OpenCore-boot.sh instead of virt-manager.

i do get this in dmesg, don't know how to fix it though:

dmesg |grep -i kvm
[  215.019676] kvm: SMP vm created on host with unstable TSC; guest TSC will not be reliable

cat /sys/devices/system/clocksource/clocksource0/available_clocksource 
hpet acpi_pm 

cat /sys/devices/system/clocksource/clocksource0/current_clocksource 
hpet

removing +invtsc isn't making any difference

ps auxw|grep qemu
simon     499511 99.5 25.5 17803184 16838992 ?   SLl  10:19   0:28 /usr/bin/qemu-system-x86_64 -name guest=monterey,debug-threads=on -S -object {"qom-type":"secret","id":"masterKey0","format":"raw","file":"/home/simon/.config/libvirt/qemu/lib/domain-4-monterey/master-key.aes"} -blockdev {"driver":"file","filename":"/data5/kvm/macos/monterey/OVMF_CODE.fd","node-name":"libvirt-pflash0-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-pflash0-format","read-only":true,"driver":"raw","file":"libvirt-pflash0-storage"} -blockdev {"driver":"file","filename":"/data5/kvm/macos/monterey/OVMF_VARS-1024x768.fd","node-name":"libvirt-pflash1-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-pflash1-format","read-only":false,"driver":"raw","file":"libvirt-pflash1-storage"} -machine pc-q35-6.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format,memory-backend=pc.ram -cpu host,migratable=on -m 16384 -object {"qom-type":"memory-backend-ram","id":"pc.ram","size":17179869184} -overcommit mem-lock=off -smp 4,sockets=4,cores=1,threads=1 -uuid 5068102b-057f-43a2-8e4f-f6ded11ffbac -display none -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=28,server=on,wait=off -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1 -device pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1 -device pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2 -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x1d.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x1d -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x1d.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x1d.0x2 -blockdev 
{"driver":"file","filename":"/data5/kvm/macos/monterey/OpenCore-v15.img","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-2-format","read-only":false,"driver":"raw","file":"libvirt-2-storage"} -device ide-hd,bus=ide.0,drive=libvirt-2-format,id=sata0-0-0,bootindex=1 -blockdev {"driver":"file","filename":"/data5/kvm/macos/monterey/monterey.qcow2","aio":"native","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-1-format","read-only":false,"discard":"unmap","cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-1-storage","backing":null} -device virtio-blk-pci,bus=pci.1,addr=0x0,drive=libvirt-1-format,id=virtio-disk0,write-cache=on -netdev tap,fd=31,id=hostnet0,vhost=on,vhostfd=30 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:6b:84:02,bus=pci.3,addr=0x0 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -audiodev id=audio1,driver=none -device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x1b -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0,audiodev=audio1 -device vfio-pci,host=0000:02:00.0,id=hostdev0,bus=pci.2,multifunction=on,addr=0x0,rombar=1 -device vfio-pci,host=0000:02:00.1,id=hostdev1,bus=pci.2,addr=0x0.0x1,rombar=1 -device usb-host,hostdevice=/dev/bus/usb/004/003,id=hostdev2,bus=usb.0,port=1 -cpu host,kvm=on,vendor=GenuineIntel,+kvm_pv_unhalt,+kvm_pv_eoi,+hypervisor -device isa-applesmc,osk=ourhardworkbythesewordsguardedpleasedontsteal(c)AppleComputerInc -smbios type=2 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on

sej7278 avatar Oct 30 '21 09:10 sej7278

Okay, I'm pretty sure that's your problem. On my system the clocksource is set to tsc, but on yours it doesn't even get offered as an option.

What does "dmesg | grep -i -e tsc -e clocksource" return? Mine reports:

[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 3400.180 MHz processor
[    0.164259] TSC deadline timer available
[    0.164329] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[    0.374869] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 133484882848 ns
[    0.394888] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x3102f8d1124, max_idle_ns: 440795299789 ns
[    0.614927] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.929463] clocksource: Switched to clocksource tsc-early
[    0.946617] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[    2.711010] tsc: Refined TSC clocksource calibration: 3399.981 MHz
[    2.724715] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x31023c93017, max_idle_ns: 440795261805 ns
[    2.769229] clocksource: Switched to clocksource tsc

I think on my system there was an option buried in the host UEFI settings for TSC synchronisation between sockets. If yours has that too, make sure it's turned on, because otherwise it might cause the TSC to be rejected.

I guess since you only have two clocksources to choose from you could try switching to the other and see if things improve:

echo acpi_pm > /sys/devices/system/clocksource/clocksource0/current_clocksource 

Finally, check for a BIOS update for your motherboard, since this is the sort of thing they fix there.
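The clocksource check described above can be sketched in Python. This only reads the two sysfs files already quoted in this thread and never writes to them; describe_clocksource is a hypothetical helper, and the fallback values are the ones reported for the affected host:

```python
from pathlib import Path

BASE = Path("/sys/devices/system/clocksource/clocksource0")


def describe_clocksource(available, current):
    """Summarise whether tsc is offered and whether it is the active clocksource."""
    names = available.split()
    cur = current.strip()
    if "tsc" not in names:
        others = [name for name in names if name != cur]
        return f"tsc not offered; current={cur}, alternatives={others}"
    if cur != "tsc":
        return f"tsc offered but not selected; current={cur}"
    return "tsc is the current clocksource"


if __name__ == "__main__":
    avail_file = BASE / "available_clocksource"
    cur_file = BASE / "current_clocksource"
    if avail_file.exists() and cur_file.exists():
        print(describe_clocksource(avail_file.read_text(), cur_file.read_text()))
    else:
        # Fall back to the values posted for the affected host.
        print(describe_clocksource("hpet acpi_pm \n", "hpet\n"))
```

On a host where the kernel has rejected the TSC, the first branch is the one to expect, and the listed alternatives are the only values that can be written to current_clocksource.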

thenickdude avatar Oct 30 '21 09:10 thenickdude

Also it sounds like recent Linux kernels 5.13, 5.14 have made timing changes that cause it to disable TSC in more situations:

https://www.phoronix.com/forums/forum/software/general-linux-open-source/1283799-linux-5-15-rc5-x86-changes-aim-to-fix-yet-another-hardware-trainwreck

If you can try 5.12 and see if the tsc clocksource comes back that would be interesting.

thenickdude avatar Oct 30 '21 09:10 thenickdude

i'll have a look in my bios tomorrow (doing a massive backup right now), but i'm in bios mode not uefi and it is the latest (dell t5610 doesn't get updated very often!). i do recall some time settings but think it was just utc.

# dmesg | grep -i -e tsc -e clocksource
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 2593.895 MHz processor
[    0.023471] TSC deadline timer available
[    0.023542] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[    0.097315] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 133484882848 ns
[    0.117335] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x2563b4fbe6b, max_idle_ns: 440795330438 ns
[    0.149339] TSC synchronization [CPU#0 -> CPU#1]:
[    0.149339] Measured 1430554 cycles TSC warp between CPUs, turning off TSC clock.
[    0.149339] tsc: Marking TSC unstable due to check_tsc_sync_source failed
[    0.357884] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.693344] clocksource: Switched to clocksource hpet
[    0.711473] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[  215.019676] kvm: SMP vm created on host with unstable TSC; guest TSC will not be reliable
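The three lines that matter in that output are the warp measurement, the "Marking TSC unstable" message and the final clocksource switch. A minimal Python sketch that extracts them from dmesg text; the regular expressions are written against the messages shown above and are an assumption about other kernel versions, and summarize_tsc is a hypothetical helper:

```python
import re

UNSTABLE = re.compile(r"Marking TSC unstable due to (.+)$")
WARP = re.compile(r"Measured (\d+) cycles TSC warp between CPUs")
SWITCHED = re.compile(r"clocksource: Switched to clocksource (\S+)")


def summarize_tsc(dmesg_text):
    """Collect the TSC warp size, the reason the TSC was rejected, and the last clocksource."""
    summary = {"unstable_reason": None, "warp_cycles": None, "final_clocksource": None}
    for line in dmesg_text.splitlines():
        match = UNSTABLE.search(line)
        if match:
            summary["unstable_reason"] = match.group(1).strip()
        match = WARP.search(line)
        if match:
            summary["warp_cycles"] = int(match.group(1))
        match = SWITCHED.search(line)
        if match:
            # Later switches overwrite earlier ones, so the last one wins.
            summary["final_clocksource"] = match.group(1)
    return summary


sample = """\
[    0.149339] Measured 1430554 cycles TSC warp between CPUs, turning off TSC clock.
[    0.149339] tsc: Marking TSC unstable due to check_tsc_sync_source failed
[    0.693344] clocksource: Switched to clocksource hpet
"""
print(summarize_tsc(sample))
```

Run against the healthy host's log posted earlier, the same function would report no warp, no unstable reason and a final clocksource of tsc.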

i've had this TSC issue since the 5.10 kernel as i recall, it's not new to 5.14

also noticed this: https://www.dell.com/community/Precision-Fixed-Workstations/TSC-warp-on-T5610-running-Linux/td-p/7820763

i might try cpu pinning to only use a single socket, but that usually kills performance (oddly enough!)
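To pick pinning targets on one socket, the logical CPUs first need to be grouped by their physical id. A minimal Python sketch that does only that grouping from /proc/cpuinfo text; it does not touch libvirt, and cpus_by_socket is a hypothetical helper whose output would still have to be copied into the vcpupin cpuset values by hand:

```python
from collections import defaultdict
from pathlib import Path


def cpus_by_socket(cpuinfo_text):
    """Map each 'physical id' (socket) to the list of logical CPU numbers on it."""
    sockets = defaultdict(list)
    cpu = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "processor":
            cpu = int(value)
        elif key == "physical id" and cpu is not None:
            sockets[int(value)].append(cpu)
    return dict(sockets)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        for socket, cpus in sorted(cpus_by_socket(cpuinfo.read_text()).items()):
            print(f"socket {socket}: logical CPUs {cpus}")
```

Keeping every vCPU on CPUs from a single entry of that mapping is the idea behind single-socket pinning; whether it avoids the TSC warp seen by the guest is untested here.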

sej7278 avatar Oct 30 '21 22:10 sej7278