aarch64 feature identification failure (mrs illegal instruction)
Zig Version
0.14.0-dev.1063+1018cdc0a
Steps to Reproduce and Observed Behavior
Run Zig's aarch64 build on a Samsung Galaxy Note 9 via Termux and attempt to execute the zig targets command.
This results in an illegal instruction. Some other commands fail the same way, but I think it's the same root cause.
With the help of the Discord we think we identified the root issue (https://discord.com/channels/605571803288698900/1273695605637644399). The following is a summary of the findings.
As part of identifying the available features on aarch64 platforms (getAArch64CpuFeature), the mrs instruction is used to read the feature registers. On the Qualcomm Snapdragon 845 in the Samsung Galaxy Note 9 this appears to be an illegal instruction.
It seems like we could fall back to something similar to what is done for the non-aarch64 cases, potentially by putting an additional condition around the switch case on line 394 of lib/std/zig/system/linux.zig that defaults to /proc/cpuinfo in the event that mrs is unavailable.
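To make the proposed /proc/cpuinfo fallback concrete, here is a minimal sketch of the parsing side, written in Python rather than Zig so it can be tried off-device. The function name parse_cpuinfo_features and the sample text are hypothetical and are not taken from Zig's standard library or from the attached cpuinfo.txt.

```python
def parse_cpuinfo_features(cpuinfo_text):
    """Return the flags from the first 'Features' line, or an empty set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


# Illustrative sample only; not the contents of the attached cpuinfo.txt.
SAMPLE = "processor\t: 0\nFeatures\t: fp asimd evtstrm aes pmull sha1 sha2 crc32\n"

features = parse_cpuinfo_features(SAMPLE)
print(sorted(features))
```

On a real device the input would come from reading /proc/cpuinfo. Mapping these flag names onto Zig's own feature set is a separate step that this sketch does not attempt.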
I also tried looking into arm.zig to see if I could figure out what those values should be but I wasn't able to locate all the flags listed by /proc/cpuinfo.
For reference, here are the contents of my /proc/cpuinfo and my lscpu output: cpuinfo.txt lscpu.txt
Expected Behavior
Output list of available platforms
Does Zig care about all the feature flags, or is there a subset that actually gets used? For instance, I didn't see the asimd flag in the arm.zig file.
It would be nice to know exactly which of these registers are problematic on that machine. It would be fairly surprising if the mrs instruction is just blocked wholesale.
The failed instruction is mrs x9, midr_el1, so definitely at least MIDR_EL1, but since that's the first one it doesn't rule out mrs not working at all. Is there a good way for me to test the other ones? I'm not too good at actually writing Zig yet, so I might need a minimal example or something.
Well, you can just try editing that array of registers, replacing calls with 0 until you narrow down which work and which don't. Armed with that info, I think it'd be easier to determine the right way forward.
I changed it in the source before rebuilding, but that doesn't seem to have properly disabled the midr check.
I figured out that part. I've tried 4 different registers and none of them work, so I really suspect that the mrs instruction just doesn't work. It's kind of a lot of steps for each register I turn off. I individually checked the 1st, 2nd, 3rd, and 8th.
Now that I take a closer look, I actually have no idea how this ever worked? EL0 is user space and EL1 is kernel mode; of course these registers will not be readable.
The problem was introduced with https://github.com/ziglang/zig/pull/19595.
@nsluhrs what is your kernel version?
I was just checking that: 4.9.186-22990479. Unfortunately, it looks like that feature wasn't allowed until 4.10. Could we have it check the kernel version as part of the process, and would we be willing to support such an old kernel version? Alternatively, a way to specify available features manually might be an interesting option.
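The kernel version check suggested above could look roughly like the following Python sketch. The helper names are hypothetical, and the (4, 10) threshold is simply the figure quoted in this comment, treated here as an assumption rather than a verified cutoff.

```python
import re


def parse_kernel_release(release):
    """Extract (major, minor) from a uname-style release string."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_at_least(release, minimum):
    """True if the release's (major, minor) is >= the given minimum tuple."""
    return parse_kernel_release(release) >= minimum


# Assumption: threshold as cited in this thread, not independently verified.
MRS_EMULATION_MIN = (4, 10)

print(kernel_at_least("4.9.186-22990479", MRS_EMULATION_MIN))  # prints False
```

On a live system the release string could be taken from platform.release(). Vendor kernels sometimes backport features, so a version comparison like this is only a heuristic.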
Formally, we only support Linux 4.19+. However, in general, patches to support older systems are welcome, as long as they have minimal maintenance cost and have no impact on code compiled for newer (supported) systems.
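One low-maintenance shape such a patch might take is a runtime capability check instead of a version comparison. On arm64 Linux the kernel advertises its emulation of EL0 reads of the ID registers through the HWCAP_CPUID bit in the AT_HWCAP entry of the auxiliary vector. That mechanism comes from the kernel's arm64 documentation and is not mentioned in this thread. The Python sketch below only demonstrates decoding the vector; the function names are hypothetical and the byte strings are synthetic.

```python
import struct

AT_NULL = 0             # terminator entry of the auxiliary vector
AT_HWCAP = 16           # entry type holding the hardware capability bits
HWCAP_CPUID = 1 << 11   # arm64-specific bit; meaningless on other architectures


def parse_auxv(data):
    """Decode a 64-bit auxiliary vector into a {type: value} dict."""
    entries = {}
    for offset in range(0, len(data) - 15, 16):
        a_type, a_val = struct.unpack_from("=QQ", data, offset)
        if a_type == AT_NULL:
            break
        entries[a_type] = a_val
    return entries


def has_hwcap_cpuid(auxv_bytes):
    """True if AT_HWCAP is present and has the HWCAP_CPUID bit set."""
    return bool(parse_auxv(auxv_bytes).get(AT_HWCAP, 0) & HWCAP_CPUID)


# Synthetic vectors for illustration; not captured from a real device.
with_cpuid = struct.pack("=QQQQ", AT_HWCAP, HWCAP_CPUID | 0x3, AT_NULL, 0)
without_cpuid = struct.pack("=QQQQ", AT_HWCAP, 0x3, AT_NULL, 0)
print(has_hwcap_cpuid(with_cpuid), has_hwcap_cpuid(without_cpuid))
```

On a 64-bit arm64 device the input would be the bytes read from /proc/self/auxv. If the bit is clear, the mrs-based probing would be skipped in favor of another source such as /proc/cpuinfo.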
Anyway, can you not just upgrade the device to a newer kernel version? Linux 4.10 was released in 2017, while the device was released in 2018, and Linux 4.19 also in 2018.
The failed instruction is
mrs x9, midr_el1
Also see previous discussion in https://github.com/termux/termux-packages/issues/20783#issuecomment-2211719239
This seems reasonable, I'll go ahead and close the issue!