add optimized implementations using RISC-V vector intrinsics

Open mr-c opened this issue 8 months ago • 5 comments

(I don't plan on doing this myself, but I wanted to start the conversation to see who is interested in doing this)

What

Use RISC-V vector intrinsics to provide optimized implementations of the existing intrinsics (X86, ARM Neon, MIPS MSA, WASM, etc.) already in SIMD Everywhere.
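As a rough illustration of what such an implementation could look like, here is a minimal sketch of a 4 x int32 lane-wise add (the operation behind `_mm_add_epi32` / `vaddq_s32`) with an RVV path and a portable fallback. The type `example_i32x4` and the function `example_add_i32x4` are hypothetical names made up for this sketch and are not SIMDe's actual internals; the RVV path assumes the `__riscv_`-prefixed intrinsics from the draft specification and loops on `vl` so that it does not depend on a particular VLEN.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__riscv_v_intrinsic)
#include <riscv_vector.h>
#endif

/* Hypothetical 128-bit vector of four int32 lanes (not a SIMDe type). */
typedef struct { int32_t i32[4]; } example_i32x4;

/* Lane-wise wrapping add of four int32 values. */
static example_i32x4 example_add_i32x4(example_i32x4 a, example_i32x4 b) {
  example_i32x4 r;
#if defined(__riscv_v_intrinsic)
  /* Strip-mined loop: vsetvl reports how many lanes this pass handles,
     so the code makes no assumption about the hardware VLEN. */
  for (size_t i = 0; i < 4; ) {
    size_t vl = __riscv_vsetvl_e32m1(4 - i);
    vint32m1_t va = __riscv_vle32_v_i32m1(a.i32 + i, vl);
    vint32m1_t vb = __riscv_vle32_v_i32m1(b.i32 + i, vl);
    __riscv_vse32_v_i32m1(r.i32 + i, __riscv_vadd_vv_i32m1(va, vb, vl), vl);
    i += vl;
  }
#else
  /* Portable fallback: add as unsigned to get wrap-around without
     signed-overflow undefined behaviour. */
  for (size_t i = 0; i < 4; i++) {
    r.i32[i] = (int32_t)((uint32_t)a.i32[i] + (uint32_t)b.i32[i]);
  }
#endif
  return r;
}
```

On a non-RISC-V host only the fallback branch is compiled, which keeps the sketch testable everywhere; a real contribution would follow SIMDe's existing feature-detection macros and private union types instead.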

Existing work

  • SSE: https://github.com/FeddrickAquino/sse2rvv (assumes a VLEN of 128 bits)
  • NEON: https://doi.org/10.48550/arXiv.2309.16509 (requires clang 17; SIMDe source code pending)
  • NEON: https://github.com/howjmay/neon2rvv

When to start

The vector extensions themselves were ratified in 2021. The intrinsics for using them from C/C++ are nearly ratified (see below), so we can start accepting contributions now.

Hopefully we will have a ratified specification by the end of this year.

(source)

Takes 45 days as the public review period (till mid-December)

(source)

Recent draft: https://github.com/riscv-non-isa/rvv-intrinsic-doc/releases/download/draft-20231014-c10de5388709b000ecc4becb0d9ee16baa0141a9/v-intrinsic-spec.pdf (latest drafts)

https://github.com/riscv-non-isa/rvv-intrinsic-doc

Which compilers to test?

Upcoming LLVM 17 and GCC trunk support v0.12, which is expected to be identical to the to-be-frozen intrinsic specification.

Clang 16 and GCC 13 support v0.11, which does not have tuple-type segment load/store intrinsics, fixed-point intrinsics with a rounding mode parameter, or floating-point intrinsics with a rounding mode parameter.
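One way to tell these compiler generations apart at compile time is the `__riscv_v_intrinsic` test macro. The sketch below assumes the encoding described in the intrinsics documentation (major * 1,000,000 + minor * 1,000 + revision, so v0.11 is 11000 and v0.12 is 12000); the `EXAMPLE_` macro and the helper function are hypothetical names used only for illustration.

```c
/* Hypothetical feature gate: 1 when the compiler advertises RVV intrinsics
   v0.12 or newer (tuple types, rounding-mode parameters), otherwise 0. */
#if defined(__riscv_v_intrinsic) && (__riscv_v_intrinsic >= 12000)
#  define EXAMPLE_RVV_HAS_V012 1
#else
#  define EXAMPLE_RVV_HAS_V012 0
#endif

/* Returns the advertised intrinsics version, or 0 when the compiler does
   not define the macro (for example on a non-RISC-V target). */
static int example_rvv_intrinsics_version(void) {
#if defined(__riscv_v_intrinsic)
  return (int)__riscv_v_intrinsic;
#else
  return 0;
#endif
}
```

Code that needs the v0.12-only intrinsics could then be wrapped in `#if EXAMPLE_RVV_HAS_V012`, with the v0.11 or portable path as the fallback.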

Benchmarking

Maybe autovectorization is good enough. Hand-written implementations should be compared against it both by instruction count and by real-world performance.
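For the real-world half of that comparison, a very small timing harness might look like the sketch below. Everything in it is hypothetical: `example_add_arrays` stands in for whichever kernel is being measured (autovectorized or hand-written), and `clock()` only gives coarse CPU time, so it is a starting point and not a rigorous benchmark. Instruction counts would have to come from a separate tool such as a simulator or hardware performance counters.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical kernel under test: element-wise wrapping add of two arrays.
   A compiler may autovectorize this loop; a hand-written RVV version with
   the same signature could be timed the same way. */
static void example_add_arrays(int32_t *dst, const int32_t *a,
                               const int32_t *b, size_t n) {
  for (size_t i = 0; i < n; i++) {
    dst[i] = (int32_t)((uint32_t)a[i] + (uint32_t)b[i]);
  }
}

typedef void (*example_kernel_fn)(int32_t *, const int32_t *,
                                  const int32_t *, size_t);

/* Returns the average CPU seconds per call over `reps` calls.
   Requires n > 0 and reps > 0. */
static double example_seconds_per_call(example_kernel_fn fn, int32_t *dst,
                                       const int32_t *a, const int32_t *b,
                                       size_t n, int reps) {
  volatile uint32_t sink = 0; /* keeps the calls from being optimized away */
  clock_t start = clock();
  for (int r = 0; r < reps; r++) {
    fn(dst, a, b, n);
    sink ^= (uint32_t)dst[n - 1];
  }
  clock_t end = clock();
  (void)sink;
  return ((double)(end - start) / (double)CLOCKS_PER_SEC) / (double)reps;
}
```

Timing the same harness with two different `fn` arguments on the same board gives a first-order comparison between an autovectorized build and a hand-written one.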

Please share any suggestions for publicly available RISC-V Vector 1.0 systems.

https://riscv.org/risc-v-developer-boards/details/

https://www.riscfive.com/risc-v-development-boards/ lists some boards with the V extension, but I can't find a public declaration that any of them follow the 1.0 version of the vector extension.

According to https://doi.org/10.48550/arXiv.2210.08882 , the following cores implement v1.0 of the RISC-V Vector Extension: SiFive X280, Andes NX27V, Atrevido 220. Note that the XuanTie 910 core, which appears in the riscfive.com list of dev boards, implements RVV version 0.7.1.

mr-c, Oct 19 '23 15:10