eve
eve copied to clipboard
[FEATURE] Webassembly SIMD support
Is your feature request related to a problem? Please describe. I'm writing a library that is used both in-browser and as a standalone application. I'd love to be able to use SIMD to accelerate the computations.
Describe the solution you'd like A reasonably easy way to use simd on both x86 architectures and webassembly (through emscripten) that doesn't require too extensive platform specific code.
Describe alternatives you've considered I don't need simd that many places so I've considered trying to hand write the functions for webassembly and switch between them at compile time.
Additional context https://emscripten.org/docs/porting/simd.html is the documentation I've been referring to.
We were thinking about doing WebAsm but we were considering targeting proposed web assembly extensions. Not quite just yet though.
The doc you linked says:
Emscripten supports compiling existing codebases that use x86 SSE by passing the -msse directive to the compiler, and including the header <xmmintrin.h>.
Currently only the SSE1, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, and 128-bit AVX instruction sets are supported.
This approach with intrinsic conversion is a good idea and should hopefully tie you over.
On AVX we will by default select a 32 bit register, which means all sort of unpleasantness for you. I'd suggest sse4.2 for emscripten, we have a good support.
Example: eve::algo::inclusive_scan_inplace
vs. std::inclusive_scan
compiled for sse4.2
I'd to start with maybe running all of the eve's tests
ninja -j 8 unit.exe && ctest -j 8
for prod build please do not forget -DNDEBUG, we assert quite extensively.
Do not hesitate to reach out, we are happy for people to try out the library.
On top of creating issues, you can:
cpplang slack: jfalcou, dyaroshev email: [email protected], [email protected] twitter: @CppSpelunker, @dyaroshev
Thanks for the quick reply. I'll see if I can get setup and benchmarked this weekend.
:+1: I personally have zero doubts that the first attempt will fail, please reach out.
Can't we consider wasm simd as a separate target architecture ?
Can't we consider wasm simd as a separate target architecture ?
Sure we can and should. Porting will take a long time though. But @ruler501 - wants to use eve now and the cross compilation should be a solid option.
I did what you suggested (added -msse -msse2 -msse3 -msse4
to compile flags and included <xmmintrin.h>
and now I'm getting following compile errors:
|| In file included from /Users/dodo/Work/Microblink/core-eve/jfalcou-eveTest/Source/EveTest.cpp:39:
|| In file included from /Users/dodo/Work/Microblink/core-eve/eve/include/eve/wide.hpp:18:
|| In file included from /Users/dodo/Work/Microblink/core-eve/eve/include/eve/arch/wide.hpp:10:
|| In file included from /Users/dodo/Work/Microblink/core-eve/eve/include/eve/arch/cpu/wide.hpp:10:
|| In file included from /Users/dodo/Work/Microblink/core-eve/eve/include/eve/arch/as_register.hpp:13:
eve/include/eve/arch/x86/as_register.hpp|53 col 65| error: use of undeclared identifier '__m256d'
|| if constexpr(std::is_same_v<Type,double> ) return __m256d{};
|| ^
eve/include/eve/arch/x86/as_register.hpp|53 col 72| error: expected ';' after return statement
|| if constexpr(std::is_same_v<Type,double> ) return __m256d{};
|| ^
|| ;
eve/include/eve/arch/x86/as_register.hpp|54 col 9| error: expected expression
|| else if constexpr(std::is_same_v<Type,float > ) return __m256{};
|| ^
eve/include/eve/arch/x86/as_register.hpp|55 col 9| error: expected expression
|| else if constexpr(std::is_integral_v<Type> ) return __m256i{};
|| ^
eve/include/eve/arch/x86/as_register.hpp|71 col 65| error: use of undeclared identifier '__m512d'
|| if constexpr(std::is_same_v<Type,double> ) return __m512d{};
|| ^
eve/include/eve/arch/x86/as_register.hpp|71 col 72| error: expected ';' after return statement
|| if constexpr(std::is_same_v<Type,double> ) return __m512d{};
|| ^
|| ;
eve/include/eve/arch/x86/as_register.hpp|72 col 9| error: expected expression
|| else if constexpr(std::is_same_v<Type,float > ) return __m512{};
|| ^
eve/include/eve/arch/x86/as_register.hpp|73 col 9| error: expected expression
|| else if constexpr(std::is_integral_v<Type> ) return __m512i{};
|| ^
eve/include/eve/arch/x86/as_register.hpp|87 col 54| error: unknown type name '__mmask8'
|| template<> struct inner_mask<8> { using type = __mmask8; };
|| ^
eve/include/eve/arch/x86/as_register.hpp|88 col 54| error: unknown type name '__mmask16'
|| template<> struct inner_mask<16> { using type = __mmask16; };
|| ^
eve/include/eve/arch/x86/as_register.hpp|89 col 54| error: unknown type name '__mmask32'
|| template<> struct inner_mask<32> { using type = __mmask32; };
|| ^
eve/include/eve/arch/x86/as_register.hpp|90 col 54| error: unknown type name '__mmask64'
|| template<> struct inner_mask<64> { using type = __mmask64; };
|| ^
... and much more similar.
It appears that EVE is attempting to use 256-bit AVX. Is there a way to prevent that?
Normally, EVE detects and includes the SSE extensions by itself. You should not have to include ximmintrin manually as we'll do it for you.
Now, we use SPY to detect extensions so maybe I need to fix SPY for emscripten ?
Possibly.
I've created a little test like this:
TEST( EveTest, supportsSIMD )
{
#if defined( __EMSCRIPTEN__ ) && !defined( __wasm_simd128__ )
EXPECT_EQ( eve::current_api, spy::undefined_simd_ );
EXPECT_FALSE( eve::supports_simd );
#else
EXPECT_NE( eve::current_api, spy::undefined_simd_ );
EXPECT_TRUE( eve::supports_simd );
#endif
}
It always fails on emscripten:
- if building without WASM SIMD support,
EXPECT_EQ( eve::current_api, spy::undefined_simd_ );
passes, butEXPECT_FALSE( eve::supports_simd );
fails - without SIMD support, EVE should report that SIMD is not supported and scalarize everything (ironically, it appears it does so, as my other test that performs some pixel manipulation works) - if building with WASM SIMD support,
EXPECT_NE( eve::current_api, spy::undefined_simd_ );
fails (I guess this is expected, as EVE does not yet know about WASM SIMD), andEXPECT_TRUE( eve::supports_simd );
passes, as expected
If no SIMD is detected, all operations are emulated indeed.
Could you add a std::cout << eve::current_api << "\n"
to your test to see which API is detected.
Alternatively, maybe we should first see if SPY (https://github.com/jfalcou/spy) works with emscripten by running the SPY test with it.
If I build without -msse2
, the eve::current_api
prints Undefined SIMD instructions set
. If I build with -msse2
, but comment-out all usages of eve::wide
, it prints X86 SSE2
. If I don't comment-out the usages and includes of eve::wide
, compilation fails with the error above.
Btw, emscripten 2.0.31, which I'm using, is based on LLVM 14, but it still uses libc++ from LLVM 12, which doesn't provide support for concepts STL library.
I've worked around this by backporting required stuff from LLVM 13 (which is released and fully supports concepts STL lib).
If you need it for your testing, grab it here: llvm-stl-polyfill.zip
Just unzip it and put it somewhere in your include path.
Same trick is also needed when building with Android NDK r23 (also based on LLVM 12) and with Xcode 13 (iOS and macOS; also based on LLVM 12).
Yeah the libc++ support is still flaky. As for the issue, I'll need to test manually as it'll get faster than me telling you to poke here and there. What I don't understand is that we include <immintrin.h> that normally contains the definition of all x86 SIMD types so I quite don't get why it whines about the non SSE. Or it means the WASM SIMD files are not made to support post SSE types ?
Or it means the WASM SIMD files are not made to support post SSE types ?
Yes.
From the Emscripten documentation:
Only the 128-bit wide instructions from AVX instruction set are available. 256-bit wide AVX instructions are not provided.
The xmmintrin.h
that ships with Emscripten supports only 128-bit SSE and AVX types and emulates them into WASM SIMD instructions. This is the table that explains how efficient the emulation for certain intrinsic is (some of them run at native speed, some of them are emulated with different WASM SIMD instructions and some are scalarized).
Also, you can find the Emscripten's SSE emulation header here - it may also provide a good baseline for proper WASM SIMD support in EVE.
Emscripten also provides other emulation headers, for easier porting of ARM NEON code as well. You can see how it works here, but SSE and SSE2 emulation works best (most of the intrinsics are 1:1 with WASM SIMD).
Lack of intrinsics even defined is annoying. I dealt with this when doing a clang based refactoring at one point.
My solution was to add using intr_name = int
or smth like that.
But it's nasty for real use-case.
You can try putting it here: https://github.com/jfalcou/eve/blob/26d50d59190d3e2817c3ca95614e3e74f6c6f30a/include/eve/arch/x86/predef.hpp#L61
Should be similar to this for arm: https://github.com/jfalcou/eve/blob/26d50d59190d3e2817c3ca95614e3e74f6c6f30a/include/eve/arch/arm/predef.hpp#L19
THis is obviously a hack so I can't tell you if it will work or not.
Maybe we can actually modify spec.hpp for x86 by wrapping the >128 test in a macro that detects emscipten properly ?
My opinion:
We should try to hack x86 to unblock @DoDoENT If that takes too much code changes and is too hard, we should just hook up wasm as a proper target.
The initial commit for that is big but not that big. It will go through emulation most of the time, which will be bad (that's why I suggest hacking x86 at the moment) but we can fix the gaps as we go along.
@DenisYaroshevskiy , this appears to be working for me at the moment. It compiles and runs and actually uses WASM SIMD (tested on Browser that doesn't support SIMD (Safari) and it fails to run the binary, as expected, and it works correctly on Chrome).
I'll make some more tests on my side (I'm still learning the API, the documentation is pretty scarce - I'm basically doing everything based on this example) and add more stubs if needed. Afterward, I can create a PR.
The fact that you have to stab functions as well as registers is non sustainable.
I will try to have a look at actually hooking up the WASM implementation proper this weekend. Unfortunately @jfalcou knows that stuff much better than I do but he's busy with other things. This will not give a complete eve solution but it will work and be converted as we go along.
What I think can work in a mean time is, instead of writing the wrappers yourself, you can try hooking up Simd Everywhere, which has WASM <=> x86 emulation complete.
https://github.com/simd-everywhere/simde
Defining SIMDE_ENABLE_NATIVE_ALIASES
+ including x86 headers probably is how it's done.
With respect to documentation being scarce, yeah - feel free to report an issue or ask on C++ lang Slack "@jfalcou" or "@dyaroshev" - we can sketch you an example.
Given that @DoDoENT discovered eve doesn't work for them - will push this back a bit
Given my schedule in the upcoming month at least I don't see this happening soon.
We need to get a proper testing harness first anyway