Kernel panic on import when using kernel with CONFIG_X86_NATIVE_CPU on Zen4
System information
| Type | Version/Name |
|---|---|
| Distribution Name | Gentoo Amd64 |
| Distribution Version | --- |
| Kernel Version | 6.16.9 + 6.17.[8,10] |
| Architecture | X86_64 |
| OpenZFS Version | 2.3.[4,5] |
Describe the problem you're observing
Kernel panic during early boot on rpool import
Describe how to reproduce the problem
Boot the system with a kernel built with CONFIG_X86_NATIVE_CPU=y. The system boots flawlessly when the kernel is built with CONFIG_X86_NATIVE_CPU not set. Likely all 6.16 and 6.17 kernel versions are affected.
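To confirm which of the two builds a given kernel actually is, the option can be read back from the kernel config. The following is a minimal sketch, not part of the original report: the helper names `read_config` and `native_cpu_state` are hypothetical, and the config location (for example `/proc/config.gz`, or a `config-*` file under `/boot`) is an assumption that depends on how the kernel was built and installed.

```python
import gzip
from pathlib import Path

OPTION = "CONFIG_X86_NATIVE_CPU"


def read_config(path: str) -> str:
    """Read a kernel config file, transparently handling a .gz suffix."""
    p = Path(path)
    if p.suffix == ".gz":
        with gzip.open(p, "rt") as fh:
            return fh.read()
    return p.read_text()


def native_cpu_state(config_text: str) -> str:
    """Return 'y', 'not set', or 'absent' for CONFIG_X86_NATIVE_CPU.

    Kconfig writes an enabled bool as 'CONFIG_FOO=y' and a disabled one
    as the comment line '# CONFIG_FOO is not set'.
    """
    for raw in config_text.splitlines():
        line = raw.strip()
        if line == f"{OPTION}=y":
            return "y"
        if line == f"# {OPTION} is not set":
            return "not set"
    return "absent"
```

For example, `native_cpu_state(read_config("/proc/config.gz"))` would report the state of the running kernel, assuming that file is exposed on the system.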
Include any warning/errors/backtraces from the system logs
None saved, since the panic happens during initrd, but the panics are mostly due to a null pointer below zio_data_buf_free, around abd_iterate_func2 or abd_iterate_page_func.
An interesting remark: I have two Zen5 systems (P16s gen4 HX pro 370) which work fine with 6.17.10 and CONFIG_X86_NATIVE_CPU=y, with a ZFS root on zfs 2.3.5.
@manschwetus could you get a screenshot of the panic? It'd be really useful!
What compiler & toolchain versions are being used here? Are they different on the Zen5 systems?
CONFIG_X86_NATIVE_CPU is only used to set compiler args in the kernel:
```make
ifdef CONFIG_X86_NATIVE_CPU
KBUILD_CFLAGS += -march=native
else
KBUILD_CFLAGS += -march=x86-64 -mtune=generic
endif
```
So it'll be some subtle codegen thing, of course. I wonder what the practical difference is between Zen 4 and Zen 5 with respect to codegen, and whether we're overriding something in a very weird way.
If we can get the trace, next step might be to disassemble some code.
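One way to narrow down the codegen question before disassembling is to compare what the compiler resolves `-march=native` to on each machine against the generic flags. The following is a minimal sketch under stated assumptions: it parses saved output of `gcc -Q --help=target` (one dump per flag set or per machine), the helper names `parse_target_options` and `diff_options` are hypothetical, and the parser assumes the usual "option, whitespace, value" layout of that output.

```python
def parse_target_options(text: str) -> dict[str, str]:
    """Parse `gcc -Q --help=target` output into {option: value}.

    Only lines that start with '-' and carry a value are kept, e.g.
    '-march=   <cpu>' or '-mavx2   [enabled]'. Headers and lines
    without a value column are ignored.
    """
    options: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("-"):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            options[parts[0]] = parts[1].strip()
    return options


def diff_options(a: dict[str, str], b: dict[str, str]) -> dict[str, tuple]:
    """Return {option: (value_in_a, value_in_b)} for every option that differs.

    An option missing on one side is reported with None for that side.
    """
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }
```

The dumps could be produced with, for instance, `gcc -march=native -Q --help=target > native.txt` and `gcc -march=x86-64 -mtune=generic -Q --help=target > generic.txt`, then compared with `diff_options(parse_target_options(open("native.txt").read()), parse_target_options(open("generic.txt").read()))`. Running the native dump on both the Zen4 and a Zen5 machine and diffing those would show which target options differ between the two builds; it does not by itself identify the miscompiled code.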
I need to check, but the systems are very similar. Both use:
- x86_64-pc-linux-gnu-14 => 14.3.1_p20250801
- coreutils 9.8-r1
- binutils x86_64-pc-linux-gnu-2.45
I made some video recordings; here are some screenshots.
I'll check whether I have more; otherwise, at the next chance I'll retry and make sure to document whether the problem is still reproducible.