KFENCE creates L1TF-vulnerable PTEs
Qubes OS release
R4.1
Brief summary
PV guests with kernel-latest (6.0.8) do not boot. This was discovered while trying to track down a PV-only memory management bug in i915.
Steps to reproduce
Set the kernel of a PV guest to 6.0.8 and try to boot it.
Expected behavior
Boot succeeds.
Actual behavior
Boot fails.
What's exactly the issue? It works for me.
All qrexec calls fail, saying that they cannot connect to the qrexec agent. Older kernel versions boot fine.
Still, works for me...
Are you using Xen with PAT patch?
I was, IIRC
Ok, with patched Xen I can confirm the failure. But since that's not PV ABI compliant, it isn't very surprising. I cannot reproduce on unpatched Xen.
(XEN) d31 L1TF-vulnerable L1e 8010000013600066 - Crashing
(XEN) domain_crash called from ./xen/include/asm/shadow.h:206
(XEN) Domain 31 (vcpu#0) crashed on cpu#1:
(XEN) ----[ Xen-4.14.5 x86_64 debug=n Not tainted ]----
(XEN) CPU: 1
(XEN) RIP: e033:[<ffffffff81e2102a>]
(XEN) RFLAGS: 0000000000000246 EM: 1 CONTEXT: pv guest (d31v0)
(XEN) rax: 0000000000000001 rbx: 0000000060e41000 rcx: ffffffff81e2102a
(XEN) rdx: 0000000000000000 rsi: 0000000000000001 rdi: ffffffff82c03d80
(XEN) rbp: 8010000013600066 rsp: ffffffff82c03d68 r8: ffff888018f88000
(XEN) r9: 0000000000000000 r10: 0000000000007ff0 r11: 0000000000000246
(XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
(XEN) r15: 0000000000000000 cr0: 0000000080050033 cr4: 0000000000362660
(XEN) cr3: 000000001ea10000 cr2: ffffc900007cf000
(XEN) fsb: 0000000000000000 gsb: ffff888013e00000 gss: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
(XEN) Guest stack trace from rsp=ffffffff82c03d68:
(XEN) 657a697320746375 0000000000000000 ffffffff8102186c 0000000060e41000
(XEN) 8010000013600066 0000000000000000 0000000000000000 0000000000000001
(XEN) ffff888013600000 ffff888013600000 ffffffff8138d5a8 0000000100000000
(XEN) 0000000000000000 ffff888013800000 ffffffff82c03e48 ffffffff8138db6d
(XEN) 00000001811afcfe 0000000000000000 e3e967228118b114 ffffffff82c03e48
(XEN) 0000000000000000 0000000000000000 ffffffff83597436 815811de7fbc79b2
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) ffffffff82c03e60 ffffffff83597774 d0e2715eed3283af ffffffff82c03e98
(XEN) ffffffff8351bef7 ffffffff83672320 0000000000000000 0000000000000000
(XEN) 3a48e669e8d7abf8 ffffffff84400000 ffffffff82c03f48 ffffffff8352bd2e
(XEN) 0000000100000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 ffffffff835191df 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
stack trace with symbols
(early) [ 0.417527] RIP: e030:xen_hypercall_mmu_update+0x8/0x20
(early) [ 0.417534] Code: cc cc 51 41 53 b8 00 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 01 00 00 00 <0f> 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
(early) [ 0.417537] RSP: e02b:ffffffff82c03d68 EFLAGS: 00000046
(early) [ 0.417540] RAX: 0000000000000001 RBX: 000000039f788000 RCX: ffffffff81e2502a
(early) [ 0.417543] RDX: 0000000000000000 RSI: 0000000080000001 RDI: ffffffff82c03d80
(early) [ 0.417546] RBP: 8010000013600066 R08: ffff888018f88000 R09: 0000000000000000
(early) [ 0.417548] R10: 0000000000007ff0 R11: 0000000000000246 R12: 0000000000000000
(early) [ 0.417550] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
(early) [ 0.417557] FS: 0000000000000000(0000) GS:ffff888013e00000(0000) knlGS:0000000000000000
(early) [ 0.417560] CS: 10000e030 DS: 0000 ES: 0000 CR0: 0000000080050033
(early) [ 0.417562] CR2: ffffc900007cf000 CR3: 0000000002c10000 CR4: 0000000000040660
(early) [ 0.417567] Call Trace:
(early) [ 0.417570] <TASK>
(early) [ 0.417573] ? __xen_set_pte+0xdc/0x210
(early) [ 0.417578] ? kfence_protect_page+0x68/0xd0
(early) [ 0.417582] ? kfence_init_pool+0x12d/0x280
(early) [ 0.417586] ? kfence_init_pool_early+0x4c/0x281
(early) [ 0.417591] ? kfence_init+0x3f/0xc4
(early) [ 0.417594] ? start_kernel+0x40d/0x6ef
(early) [ 0.417599] ? xen_start_kernel+0x5c4/0x5e9
(early) [ 0.417603] ? startup_xen+0x1f/0x1f
(early) [ 0.417607] </TASK>
Lowering to P: minor as the use of PV guests is discouraged for security reasons.
As per @andyhhp (in https://github.com/QubesOS/qubes-issues/issues/8593#issuecomment-2414978768):
@aronowski and/or @DemiMarie
I took this to [email protected] before realising it had been fully discussed in public. Could either of you first re-repro the issue, and then try the following patch. It's been Okay'd by the KFENCE folks already.
0001-x86-kfence-Avoid-writing-L1TF-vulnerable-PTEs.patch.txt
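The problem is that KFENCE creates L1TF-vulnerable page table entries while setting up its pool (see `kfence_protect_page` in the call trace above). Xen refuses such entries from a PV guest and crashes the domain, which is the `L1TF-vulnerable L1e 8010000013600066 - Crashing` message in the Xen log. The above patch prevents KFENCE from creating such entries.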
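To illustrate why Xen rejects the entry from the log, here is a minimal Python sketch of a simplified L1TF check. It is a model under stated assumptions, not Xen's actual code: the 52-bit address mask and the `SAFE_MADDR` threshold are assumptions (the real values depend on the CPU), and the helper names are hypothetical. The sketch also shows address-bit inversion, a technique Linux uses elsewhere for non-present PTEs; this thread does not say whether the linked patch takes exactly that approach.

```python
# Simplified model of an L1TF check on an x86-64 PTE.
# Assumptions (not taken from the thread): 52 physical address bits,
# 4 KiB pages, and a hypothetical "safe" address threshold.

PAGE_PRESENT = 1 << 0                    # bit 0: Present
ADDR_MASK = ((1 << 52) - 1) & ~0xFFF     # bits 12..51: physical address field
SAFE_MADDR = 1 << 45                     # hypothetical; real value is CPU-specific


def pte_addr(pte: int) -> int:
    """Return the physical address field of a PTE."""
    return pte & ADDR_MASK


def is_l1tf_vulnerable(pte: int, safe_maddr: int = SAFE_MADDR) -> bool:
    """A non-present PTE is treated as vulnerable when its address field
    is non-zero and points below the safe threshold, i.e. at memory that
    could plausibly be cached."""
    if pte & PAGE_PRESENT:
        return False
    addr = pte_addr(pte)
    return addr != 0 and addr < safe_maddr


def invert_addr(pte: int) -> int:
    """Flip every address bit, leaving the flag bits untouched."""
    return pte ^ ADDR_MASK


crashing_pte = 0x8010000013600066        # value from the Xen log above

print(hex(pte_addr(crashing_pte)))                    # address field
print(bool(crashing_pte & PAGE_PRESENT))              # Present bit
print(is_l1tf_vulnerable(crashing_pte))               # flagged by the model
print(is_l1tf_vulnerable(invert_addr(crashing_pte)))  # after inversion
```

In this model the logged entry has the Present bit clear while still carrying the address `0x13600000`, which is the combination the check flags. Inverting the address bits moves the address far above the assumed threshold, so the same check no longer fires.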