rvv-bench
Missing valid instructions
- Widening reductions (`vfwredosum.vs`, `vfwredusum.vs`, `vwredsum.vs`, `vwredsumu.vs`) should allow LMUL=8
- `vrgatherei16.vv` should only disallow LMUL=8 for e8
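For context on the second point, here is a minimal sketch of the RVV 1.0 rule as I understand it: `vrgatherei16.vv` reads its indices with EEW=16, so the index operand has EMUL = (16/SEW)*LMUL, and the encoding is reserved when that EMUL falls outside 1/8..8. The function names are made up for illustration, and only integer LMULs are enumerated.

```python
from fractions import Fraction

SEWS = (8, 16, 32, 64)
LMULS = (1, 2, 4, 8)  # integer LMULs only, to keep the sketch small


def vrgatherei16_index_emul(sew, lmul):
    """EMUL of the 16-bit index operand: (16 / SEW) * LMUL."""
    return Fraction(16, sew) * lmul


def vrgatherei16_legal(sew, lmul):
    """The index register group must have an EMUL between 1/8 and 8."""
    emul = vrgatherei16_index_emul(sew, lmul)
    return Fraction(1, 8) <= emul <= 8


# Collect every (SEW, LMUL) pair that the rule rejects.
illegal_pairs = [
    (sew, lmul)
    for sew in SEWS
    for lmul in LMULS
    if not vrgatherei16_legal(sew, lmul)
]
print(illegal_pairs)  # [(8, 8)]
```

With SEW=8 and LMUL=8 the index group would need EMUL=16, which is why that is the only combination that has to be excluded here.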
The first point should be fixed now, thanks.
The current design masks LMUL and SEW independently, so I can't add `vrgatherei16.vv` for LMUL=8 with SEW!=8 for now. I'll have to look into restructuring the code or allowing special cases.
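To illustrate the limitation (this is a sketch, not rvv-bench's actual code, and all names are hypothetical): two independent masks can only describe a full SEW x LMUL grid, and "everything except e8 at LMUL=8" is not such a grid.

```python
SEWS = (8, 16, 32, 64)
LMULS = (1, 2, 4, 8)


def product_mask(sews, lmuls):
    """All (SEW, LMUL) pairs enabled by two independent masks."""
    return {(s, l) for s in sews for l in lmuls}


def expressible_as_product(pairs):
    """True if `pairs` is exactly the product of its SEW and LMUL projections."""
    sews = {s for s, _ in pairs}
    lmuls = {l for _, l in pairs}
    return product_mask(sews, lmuls) == pairs


# What vrgatherei16.vv needs: every combination except (SEW=8, LMUL=8).
wanted = product_mask(SEWS, LMULS) - {(8, 8)}

print(expressible_as_product(wanted))                           # False
print(expressible_as_product(product_mask((16, 32, 64), LMULS)))  # True
```

A mask keyed on (SEW, LMUL) pairs, like the `wanted` set, can express this case directly.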
The latest commit 7b3f7b6 fixes this, I'll update the results page soon.
I've now updated the instruction cycle count measurement code to remove the destination vector dependency that processors without vl prediction suffer from. This fixes the weird 4, 4, 5, 8 LMUL scaling on the C908; with proper measurements it's now 1, 2, 4, 8.
Implementing that required a big rewrite and yet another preprocessor, but the code now allows for fine-grained SEW and LMUL masking.
A couple months ago I was working on a very-generated risc-v/rvv instruction benchmarking thing (as in, JS generates 570MB of assembly which then becomes a 129MB binary) and it's largely complete (having things like separate throughput & latency tests, register cycling galore for removing dependencies (also tests intentionally leaving them in), precise argument initialization, testing different constants where applicable; rather inspired from uops.info) but I just kinda stopped working on it and haven't published it (doesn't help that I have no risc-v hardware).
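A rough sketch of the register-cycling idea, assuming a plain vector-vector op with `vd, vs2, vs1` operand order. The generator and its parameters are hypothetical, not taken from either project. The throughput variant keeps the sources fixed and rotates destinations; the latency variant feeds each result into the next instruction.

```python
def emit_sequence(mnemonic, lmul, count, mode):
    """Return `count` assembly lines for a vector-vector op.

    mode="throughput": independent instructions, destinations rotate.
    mode="latency":    each instruction reads the previous destination.
    """
    if mode not in ("throughput", "latency"):
        raise ValueError(f"unknown mode: {mode}")
    # Register groups must start at a multiple of LMUL (integer LMUL assumed).
    groups = list(range(0, 32, lmul))
    src_a, src_b = groups[0], groups[1]
    dests = groups[2:]  # at least two groups remain, even for LMUL=8

    lines = []
    prev = src_a
    for i in range(count):
        dst = dests[i % len(dests)]
        first_src = src_a if mode == "throughput" else prev
        lines.append(f"{mnemonic} v{dst}, v{first_src}, v{src_b}")
        prev = dst
    return lines


for line in emit_sequence("vadd.vv", 8, 3, "latency"):
    print(line)
# vadd.vv v16, v0, v8
# vadd.vv v24, v16, v8
# vadd.vv v16, v24, v8
```

Note that rotating destinations only removes dependencies the core tracks through source operands; on cores where the destination itself acts as an input, more care is needed.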
here's a screenshot of qemu (VLEN=256) timings in it :)
Probably should put in the last bits of effort on that, but the leftovers are rather annoying (making the UI less clunky; displaying multiple archs in the same table (questions on vl/vlmax matching); probably 0.7.1 support otherwise the previous point is somewhat pointless; have latency tests between different-width/type operands which is a massive mess largely due to it being technically impossible to do properly)
The table was also inspired by uops.info, but mine is way less sophisticated than that or yours.
I was looking into adding RISC-V support to llvm-exegesis, which does something very similar, but I need to have bare-metal support to test on RTL simulations.
> probably 0.7.1 support otherwise the previous point is somewhat pointless
I'm planning to drop 0.7.1 support soon, since I've now got two rvv 1.0 boards, don't have ssh access to a C920 anymore, and there will be some more releases this year.