riscv: SIGBUS when running `test_simd1` on an RVV 1.0 CPU
crashing in the first SIMD test with:
(gdb) x/16i $pc-32
0x3ff7fd7ecc: addi t1,t1,1
0x3ff7fd7ed0: slli t1,t1,0xc
0x3ff7fd7ed4: xori t1,t1,-320
0x3ff7fd7ed8: vse32.v v29,(t1)
0x3ff7fd7edc: addi a0,s0,-1001
0x3ff7fd7ee0: addi a1,s0,1001
0x3ff7fd7ee4: vsetivli t1,4,e32,m1,tu,mu
0x3ff7fd7ee8: addi t1,a0,1211
=> 0x3ff7fd7eec: vle32.v v2,(t1)
0x3ff7fd7ef0: vsetivli t1,4,e32,m1,tu,mu
0x3ff7fd7ef4: addi t1,a1,-771
0x3ff7fd7ef8: vse32.v v2,(t1)
0x3ff7fd7efc: li a0,32
0x3ff7fd7f00: li a1,36
0x3ff7fd7f04: vsetivli t1,2,e64,m1,tu,mu
0x3ff7fd7f08: slli t1,a0,0x3
(gdb) info reg t1
t1 0x3fffffeed2 274877902546
with the following CPU:
model name : Spacemit(R) X60
isa : rv64imafdcv_zicbom_zicboz_zicntr_zicond_zicsr_zifencei_zihintpause_zihpm_zfh_zfhmin_zca_zcd_zba_zbb_zbc_zbs_zkt_zve32f_zve32x_zve64d_zve64f_zve64x_zvfh_zvfhmin_zvkt_sscofpmf_sstc_svinval_svnapot_svpbmt
mmu : sv39
mvendorid : 0x710
the bug is triggered by case 5, so SLJIT_SIMD_MEM_ALIGNED_16 is probably not supported, at least on this CPU:
https://github.com/zherczeg/sljit/blob/f6326087b3404efb07c6d3deed97b3c3b8098c0c/test_src/sljitTestSimd.h#L142-L147
the documentation for RVV mentions:
Implementations are allowed to raise a misaligned address exception on whole register loads and stores if the base address is not naturally aligned to the larger of the size of the encoded EEW in bytes (EEW/8) or the implementation’s smallest supported SEW size in bytes (SEWMIN/8).
Note: Allowing misaligned exceptions to be raised based on non-alignment to the encoded EEW simplifies the implementation of these instructions. Some subset implementations might not support smaller SEW widths, so are allowed to report misaligned exceptions for the smallest supported SEW even if larger than encoded EEW. An extreme non-standard implementation might have SEWMIN>XLEN for example. Software environments can mandate the minimum alignment requirements to support an ABI.
also, the system is running Debian (but with a vendor kernel), so it is possible that other misaligned load exceptions are being masked (or could be masked)
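For reference, the faulting address in the gdb dump can be checked by hand: `t1` is 0x3fffffeed2 and the faulting instruction is `vle32.v`, so the element width is 32 bits (4 bytes), while the address is only a multiple of 2. A minimal sketch of that arithmetic follows; the helper name `is_naturally_aligned` is made up for illustration and is not part of sljit:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns nonzero when addr is a multiple of
   eew_bits / 8 bytes, i.e. naturally aligned for that element width. */
static int is_naturally_aligned(uint64_t addr, unsigned eew_bits)
{
    uint64_t bytes = eew_bits / 8;

    /* The mask trick below only works for power-of-two sizes. */
    assert(bytes != 0 && (bytes & (bytes - 1)) == 0);
    return (addr & (bytes - 1)) == 0;
}
```

With the value from `info reg t1`, `is_naturally_aligned(0x3fffffeed2, 32)` is false and `is_naturally_aligned(0x3fffffeed2, 16)` is true, which is consistent with the load trapping on an implementation that insists on element alignment.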
Interesting limitations. I have never tried to run the code on real hardware; I have no access to any. The compiler can return with SLJIT_UNSUPPORTED if these limitations can be detected somehow.
FWIW, gcc 14.2.0 also triggers a Bus error, but the next version seems to default to NOT allowing misaligned loads unless requested.
I remember riscv was proud that misaligned memory support is always available.
Anyway, the test can be enhanced with more support[i] tests, and riscv could return with SLJIT_UNSUPPORTED for the unsupported forms, if this can be tested somehow.
> I remember riscv was proud that misaligned memory support is always available.
Not sure if I would qualify it as "proud", but the Zicclsm extension that is mandatory for RVA20U64 profile CPUs said:
Even though mandated, misaligned loads and stores might execute extremely slowly. Standard software distributions should assume their existence only for correctness, not for performance.
And at least for Linux, the hwprobe RISCV syscall (which might be useful to allow probing also for the vector case) exports the performance characteristics of misaligned access to user space (see RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF).
my suggestion was to follow gcc in disabling this by default, but what we are missing is a way to enable it back at runtime (reenabling it at build time by leveraging gcc's notion of what the target can support would be nice but it is not something that can be exported now, unlike the other options we used; of course we could add an SLJIT specific flag to do so instead but that doesn't seem flexible enough IMHO)
> And at least for Linux, the hwprobe RISCV syscall (which might be useful to allow probing also for the vector case) exports the performance characteristics of misaligned access to user space (see RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF).
note that you'll want RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF instead for this particular bit of code; a lot of the crappy early hardware does manage to work for misaligned scalars and only trip over on misaligned vectors (which is why the two hwprobes are now distinct).
note also that the distros' recent "you must be at least rva23 to enjoy this ride"[1] implicitly requires that misaligned scalar and vector accesses are supported. so if you're not interested in supporting lesser hardware than they are...
[1] which, yes, does mean basically "just qemu": https://www.phoronix.com/news/Ubuntu-25.10-RISC-V-QEMU
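A rough sketch of what runtime probing via hwprobe could look like, in case it helps the discussion. Everything here is an assumption to be checked against the kernel headers: the numeric values are what I believe `<asm/hwprobe.h>` defines for the vector key, `struct hwprobe_pair` mirrors the kernel's `struct riscv_hwprobe`, and the function names and the "slow or fast means usable" policy are made up for illustration. On anything that is not Linux on riscv the probe just reports "not supported":

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && defined(__riscv)
#include <sys/syscall.h>
#include <unistd.h>
#endif

/* Assumed to match Linux's <asm/hwprobe.h>; verify before relying on them. */
#define HWPROBE_KEY_MISALIGNED_VECTOR_PERF    10
#define HWPROBE_MISALIGNED_VECTOR_UNKNOWN     0
#define HWPROBE_MISALIGNED_VECTOR_SLOW        2
#define HWPROBE_MISALIGNED_VECTOR_FAST        3
#define HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED 4

/* Same layout as the kernel's struct riscv_hwprobe (key/value pair). */
struct hwprobe_pair {
    int64_t key;
    uint64_t value;
};

/* Hypothetical policy: only treat misaligned vector accesses as usable
   when the kernel explicitly reports them as slow or fast. A kernel that
   does not know the key rewrites it to -1, which is rejected here. */
static int misaligned_vector_ok(const struct hwprobe_pair *pair)
{
    assert(pair != NULL);
    if (pair->key != HWPROBE_KEY_MISALIGNED_VECTOR_PERF)
        return 0;
    return pair->value == HWPROBE_MISALIGNED_VECTOR_SLOW
        || pair->value == HWPROBE_MISALIGNED_VECTOR_FAST;
}

/* Asks the kernel about all online CPUs (cpusetsize 0, cpus NULL). */
static int probe_misaligned_vector(void)
{
    struct hwprobe_pair pair = { HWPROBE_KEY_MISALIGNED_VECTOR_PERF, 0 };

#if defined(__linux__) && defined(__riscv) && defined(__NR_riscv_hwprobe)
    if (syscall(__NR_riscv_hwprobe, &pair, (size_t)1, (size_t)0, NULL, 0U) != 0)
        return 0; /* syscall missing or failed: be conservative */
#else
    pair.key = -1; /* no hwprobe on this platform */
#endif
    return misaligned_vector_ok(&pair);
}
```

The result of `probe_misaligned_vector()` could then gate whether the unaligned SIMD forms are emitted or reported as unsupported; treating "unknown" as unusable is the conservative choice given the crash above.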