read_feature_regs crashed on some ARM hardware
read_feature_regs in proc_init_arch crashes at the following MRS on some ARM hardware:
https://github.com/DynamoRIO/dynamorio/blob/304faf96845b06b4889e1c2ce1b49c3ba9b22462/core/arch/aarch64/proc.c#L61
@AssadHashmi Is this expected to happen on some processors? How do we handle it?
Apologies for the late reply, I have been on vacation.
This is not expected to happen. MRS instructions that access _EL1 registers are handled by the kernel, which provides the contents to user space.
Does the hardware use a stock Linux kernel? We have been running on AWS instances with 5.15 kernels, e.g. 5.15.0-1040-aws, 5.15.0-1043-aws, and 5.15.0-1039-aws.
Is the crash related to kernel privilege levels/permissions or is it a SIGILL?
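For reference, a minimal sketch of how user space can tell whether the kernel will handle these MRS reads. It assumes a Linux AArch64 target and checks the `HWCAP_CPUID` bit before issuing the `mrs`; the helper names `read_pfr0` and `pfr0_has_sve` are hypothetical and are not DynamoRIO's actual code. On other targets the reader simply reports that the register is unavailable.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__linux__) && defined(__aarch64__)
#    include <sys/auxv.h>
#    ifndef HWCAP_CPUID
#        define HWCAP_CPUID (1UL << 11) /* Kernel handles MRS reads of ID registers. */
#    endif
#endif

/* Hypothetical helper: the SVE field is bits [35:32] of ID_AA64PFR0_EL1;
 * a non-zero value means SVE is implemented.
 */
static bool
pfr0_has_sve(uint64_t pfr0)
{
    return ((pfr0 >> 32) & 0xf) != 0;
}

/* Hypothetical helper: reads ID_AA64PFR0_EL1 only when the kernel advertises
 * HWCAP_CPUID. Returns false (leaving *pfr0 untouched) otherwise.
 */
static bool
read_pfr0(uint64_t *pfr0)
{
#if defined(__linux__) && defined(__aarch64__)
    if ((getauxval(AT_HWCAP) & HWCAP_CPUID) != 0) {
        __asm__ volatile("mrs %0, id_aa64pfr0_el1" : "=r"(*pfr0));
        return true;
    }
#endif
    (void)pfr0;
    return false;
}
```

If `read_pfr0` returns false on an AArch64 Linux machine, the kernel is not handling these reads, and a SIGILL at the `mrs` would be the expected symptom.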
On investigating this further, I found that the crash is actually not at the mrs instruction but at the rdvl instruction added by this inline asm:
https://github.com/DynamoRIO/dynamorio/blob/a2f8bcc95def89de92141e627e53c796b1377089/core/arch/aarch64/proc.c#L128
The crash happens only when certain compiler optimizations are enabled. For some reason, they cause incorrect asm to be generated for proc_init_arch: the mrs x_, id_aa64pfr0_el1 and the rdvl inline asm are placed next to each other in the same basic block. That is incorrect because the rdvl is supposed to be gated by the if-condition, so it must not execute on hardware without SVE.
This may be because our inline asm uses encoded instruction bytes, which the compiler may not recognize (we're not compiling with -march=armv8-a+sve), but I don't know for sure. In any case, adding volatile to the asm block prevents the compiler from moving it out from under the if-condition, and fixes the crash. I'll send a PR.
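To illustrate the shape of the fix, here is a minimal sketch, not the actual patch. It assumes the rdvl is emitted as a raw `.inst` word (`0x04bf5020` encodes `rdvl x0, #1`) and that the caller has already determined whether SVE is present; the function name `read_sve_vector_length` is hypothetical. GCC documents that an asm statement with output operands and no `volatile` qualifier may be moved or deleted by the optimizer, which is consistent with the hoisting seen here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns the SVE vector length in bytes, or 0 when
 * has_sve is false or the target is not AArch64.
 */
static uint64_t
read_sve_vector_length(bool has_sve)
{
    uint64_t vl_bytes = 0;
    if (has_sve) {
#if defined(__aarch64__)
        /* volatile keeps the compiler from treating this asm as a pure
         * computation that it may hoist above the has_sve check.
         */
        __asm__ volatile(".inst 0x04bf5020\n" /* rdvl x0, #1 */
                         "mov %0, x0"
                         : "=r"(vl_bytes)
                         :
                         : "x0");
#endif
    }
    return vl_bytes;
}
```

Without `volatile`, the compiler sees an asm with no inputs and one output, and is free to schedule it before the branch; on hardware without SVE the rdvl then raises SIGILL regardless of the feature check.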