wasmer SinglePass: add support for riscv64 target

The PR adds the rudimentary support for RISC-V 64-bit target into the SinglePass compiler. ~Even though, the code still needs some polishing and various TODOs must be addressed, the compiler can pass all (104) the spec tests.~

TODOs:

[x] unwinding support
[x] FP status register - save/restore
[x] a general location_cmp + jmp_on_X needs polishing as RISC-V target does not provide a FLAGS register (#5750).
[x] spec/address.wast is flaky (sometimes trap is not properly triggered)
[x] basic testing - all compiler tests (including the WASI tests) are green now
[x] include RISCV-V in CI by running the tests in qemu

Minor improvements:

[ ] CLZ, CTZ and POPCOUNT have sub-optimal performance - could be implemented as builtins in the run-time (fn call overhead)
[ ] still some unimplemented entry points for the Machine trait - wipe unused entry points in the Trait
[ ] more intensive testing needed (including real hardware + valgrind)

Aug 29 '25 17:08 marxin

Nice work! Huge kudos on making this happen!

We've been testing this branch and here's some early discoveries:

target-lexicon used by wasmer don't have a default calling convention, so when we are doing cross-compilation, the code will always fail at those lines, by any means you feel like setting SystemV as the default calling convention for RISC-V?
We do notice the emitter does not cover all cases of the code structure, for example, we hit codegen errors in SLL / SRL generations(in some cases src1 is actually an immediate). We have summarized our changes in this commit, which also includes the above default calling convention change. It works for us for the moment, but it might not be an exhausive list of codegen fixes. If you like those changes, feel free to incorporate them into this PR.
We do notice some unaligned memory loads(i.e., LD tries to load from an address which is not 8-byte aligned), on our platform we have strict alignment rules, so just want to check: is it a goal for wasmer to generate aligned loads / stores?

Thanks for the awesome work!

Sep 11 '25 06:09 wakabat

Nice work! Huge kudos on making this happen!

Thank you!

target-lexicon used by wasmer don't have a default calling convention, so when we are doing cross-compilation, the code will always fail at those lines, by any means you feel like setting SystemV as the default calling convention for RISC-V?

Sounds good to me right now.

We do notice the emitter does not cover all cases of the code structure, for example, we hit codegen errors in SLL / SRL generations(in some cases src1 is actually an immediate). We have summarized our changes in this commit, which also includes the above default calling convention change. It works for us for the moment, but it might not be an exhausive list of codegen fixes. If you like those changes, feel free to incorporate them into this PR.

Again, thank you. Right now, I am going to cover more of the wasmer's test-cases on RISC-V target and I am going to cover all the missing pieces of the codegen.

We do notice some unaligned memory loads(i.e., LD tries to load from an address which is not 8-byte aligned), on our platform we have strict alignment rules, so just want to check: is it a goal for wasmer to generate aligned loads / stores? Thanks for the awesome work!

Interesting! To be honest, based on the search on the Internet, the architecture itself does not mandate if unaligned memory operations are allowed or not. However, as you described, there are definitely strict alignment boards in the wild. What do you use for testing? My VisionFive 2 board works fine when it comes to unaligned memory operations.

Sep 15 '25 20:09 marxin

Interesting! To be honest, based on the search on the Internet, the architecture itself does not mandate if unaligned memory operations are allowed or not. However, as you described, there are definitely strict alignment boards in the wild. What do you use for testing? My VisionFive 2 board works fine when it comes to unaligned memory operations.

Yes RISC-V spec do not require alignments of memory loads / stores, but I'm willing to bet there will be hardwares (maybe not VisionFive 2) that rely on alignments to simplify the implementation. We actually use a software RISC-V VM in which alignments help us significantly simplify the implementation. In addition, the LLVM compiler in wasmer fully respect alignments in the generated code.

Give those thought, can we at least make it an option / flag, which would generate aligned loads / stores when flipped on?

Sep 17 '25 22:09 wakabat

One more thing: there is a fix to dynasm which we will need: https://github.com/CensoredUsername/dynasm-rs/pull/118. The fix is released in dynasm 4.0.1. We'd appreciate it very much if you can help upgrade wasmer to use dynasm 4.0.1.

Sep 18 '25 05:09 wakabat

Hey so I can confirm that all previous issues have been fixed(well, expect for unaligned loads / stores), thank you!

We've been testing more on the code, and have put together some more fixes in this commit. As always, feel free to incorporate those changes into this PR.

Some of the fixes we applied include:

I've implemented unaligned loads / stores(you can search for emit_maybe_unaligned_load and emit_maybe_unaligned_store) for some cases. It's likely they still require changes but I consider them to be a starting point for more discussions.
There are some more codegen missing cases we found in our tests.
I wasn't so sure about all the (disp & 0x3) == 0 asserts in this code piece. To me it asserts that the offset part of a load / store operation is aligned to 4 bytes. But from the above discussion, it does seem that wasmer right now does not enforce alignments for loads / stores, so any particular reason that the offsets here are 4-byte aligned? In addition, if we do want to add alignment here, maybe we can align based on the bytes read? For example, when reading 2 bytes, we can expect the offset to be aligned by 2, when reading 8 bytes, we can expect the offset to be aligned by 8. And when reading a single byte, we should not need any alignment for the offset. What do you think?

Last, one of our tests hit on the issue that unreachable operator is not implemented yet. In other compilers, this would properly trigger a wasmer trap, but for the moment this is missing in the singlepass compiler for rv64 target, any plan for implementing this?

Again, thanks for all the awesome work!

Sep 21 '25 08:09 wakabat

Hey so I can confirm that all previous issues have been fixed(well, expect for unaligned loads / stores), thank you!

Glad for that :rocket:

We've been testing more on the code, and have put together some more fixes in this commit. As always, feel free to incorporate those changes into this PR.

Happily taking the changes with the following tweaks:

I've implemented unaligned loads / stores(you can search for emit_maybe_unaligned_load and emit_maybe_unaligned_store) for some cases. It's likely they still require changes but I consider them to be a starting point for more discussions.

I've taken the change, where I made a small refactoring connected to the fact we can emit the instructions in a loop.

There are some more codegen missing cases we found in our tests.

When it comes to shift operations, I recently added ImmType::Shift32 and ImmType::Shift64, and these types guarantee we won't emit an unexpected immediate values. Feel free to test my branch now.

I wasn't so sure about all the (disp & 0x3) == 0 asserts in this code piece. To me it asserts that the offset part of a load / store operation is aligned to 4 bytes. But from the above discussion, it does seem that wasmer right now does not enforce alignments for loads / stores, so any particular reason that the offsets here are 4-byte aligned? In addition, if we do want to add alignment here, maybe we can align based on the bytes read? For example, when reading 2 bytes, we can expect the offset to be aligned by 2, when reading 8 bytes, we can expect the offset to be aligned by 8. And when reading a single byte, we should not need any alignment for the offset. What do you think?

Correct, we should not expect any particular displacement value and the only correct (as far as I known) check is the run-time one (emit_maybe_unaligned_load/emit_maybe_unaligned_store).

Last, one of our tests hit on the issue that unreachable operator is not implemented yet. In other compilers, this would properly trigger a wasmer trap, but for the moment this is missing in the singlepass compiler for rv64 target, any plan for implementing this?

Hm, this should be correctly supported (and tested by various spec tests). Note RISC-V can't encode directly a payload into the unimp instructions and thus I save the payload as li.12 a0, payload as _. In the run-time library, the trap is correctly decoded and proper TrapCode is reported. Please provide a test-case where you won't get the expected output.

Generally speaking, I reached a point, where most of the wasmer tests are passing.

Would be great if you could test the codegen more intensively. Looking forward for a feedback.

Sep 21 '25 18:09 marxin

Last, one of our tests hit on the issue that unreachable operator is not implemented yet. In other compilers, this would properly trigger a wasmer trap, but for the moment this is missing in the singlepass compiler for rv64 target, any plan for implementing this?

Hm, this should be correctly supported (and tested by various spec tests). Note RISC-V can't encode directly a payload into the unimp instructions and thus I save the payload as li.12 a0, payload as _. In the run-time library, the trap is correctly decoded and proper TrapCode is reported. Please provide a test-case where you won't get the expected output.

I think the current code actually relies on an OS trap on unimp to happen, wasmer then capture the trap and translate it to correct TrapCode. I'm sure this works for native environment such as a real VisionFive 2 RISC-V board, but for embedded RISC-V chips in bare-metal environment, or in our case in a software VM, traps are not supported, unimp simply translates to a VM failure.

If we look at wasmer's LLVM code: https://github.com/wasmerio/wasmer/blob/817bc7af38da0ef41cdb977e685bb3652f5c1bf3/lib/compiler-llvm/src/translator/code.rs#L2383-L2387, Unreachable opcode in wasm actually translates to a wasmer trap call directly, which would be correctly picked up in those non-OS environments. Any chance we can change the implementation in the singlepass compiler to mirror the LLVM compiler?

Many thanks for all the other changes! They do look great!

Sep 22 '25 07:09 wakabat

Any chance we can change the implementation in the singlepass compiler to mirror the LLVM compiler?

Sure, that's definitely something that could be implemented by a direct call to the trap function. However, have you considered the other TrapCodes that are currently reported via the signalling mechanism: https://github.com/wasmerio/wasmer/blob/fac92435d61799a9b83fe8c3bb8c9edc1d91be94/lib/types/src/trapcode.rs#L23-L66 ?

How will it play with your environment?

Sep 29 '25 21:09 marxin

Hey thanks for the reply!

However, have you considered the other TrapCodes that are currently reported via the signalling mechanism

I think it might be a big ask, but ideally, I would appreciate it if we can have all of them wrapped properly in a wasmer lib trap. In fact some trap code is already processed this way in wasmer: https://github.com/wasmerio/wasmer/blob/fac92435d61799a9b83fe8c3bb8c9edc1d91be94/lib/vm/src/libcalls/eh/gcc.rs#L275

In practice, we might have to differ one trap code from another, as they might be processed differently:

IntegerDivisionByZero and likely IntegerOverflow are never thrown in RISC-V code, division by zero is handled at ISA level.
BadConversionToInteger is a non-issue for us since we don't use floating point number.
For trap codes like StackOverflow, HeapAccessOutOfBounds, HeapMisaligned, TableAccessOutOfBounds, IndirectCallToNull, BadSignature, we would simply treat the source program to be faulty, so even if current unimp instruction is used, we shall be fine. A VM(well, our RISC-V VM) level error will be thrown anyway.
UnreachableCodeReached is the one we need to be wrapped properly in a wasmer lib trap, since we want to explicitly detect that the wasm Unreachable opcode is called.
For UncaughtException, as long as unwinding is added, I think wasmer would throw this in a lib trap anyway.

Sep 30 '25 00:09 wakabat

All right, I switched the implementation of the Unreachable for the Singlepass in the PR to use the direct call to wasmer_vm_raise_trap.

IntegerDivisionByZero and likely IntegerOverflow are never thrown in RISC-V code, division by zero is handled at ISA level.

About this one, note the emitted code does an explicit check and jumps to the 2 generated labels connected to the TrapCodes. Thus, the signalling mechanism and the unimp instruction is hit. Please try to run a test-case that stress-tests the traps.

Happy to get feedback!

Oct 03 '25 21:10 marxin

Sorry for the delay, I can confirm that current tip in this PR works for all our existing tests. Thanks for the awesome work!

Oct 08 '25 01:10 wakabat

All right, I switched the implementation of the Unreachable for the Singlepass in the PR to use the direct call to wasmer_vm_raise_trap.

@wakabat Just wanted to let you know that I've decided to skip the change from this initial PR, but it's going to come shortly after we merge this PR as it's a change that influences other targets as well.

Addressed in #5821.

Oct 21 '25 15:10 marxin

@syrusakbary The PR is basically ready to be merged. However, it depends on a smaller tweaks to the CI when it comes to testing of riscv64gc target: #5817 #5815 #5809

Oct 22 '25 09:10 marxin

The CI job Run WAST tests on linux-riscv64 is green when running in CI under QEMU. Yay!

Oct 22 '25 13:10 marxin

The PR received a couple of fixed that very identified and reduced from running the RISC-V Singlepass compiler on tests for the following Rust crates:

aho-corasick
bstr
bytesize
glob
indexmap
itertools
itoa
memchr
quickcheck
rustc-demangle
rust-shlex
rust-url
ryu
strsim-rs
strum
time
tinyvec
unicode-ident
uuid

Nov 24 '25 12:11 marxin