Exploration of calling convention, pass by ref/value and forced inline on MSVC

Open marzer opened this issue 3 years ago • 0 comments

Hola. This PR is a spiritual successor to #646. I don't really intend for it to be merged (it's a massive diff); rather, I've done the work to parameterize calling convention, pass by ref/value, and forced inlining so they can be tested individually or together as desired, since these were all the variables discussed in that PR.

The main changes are:

  • added configurable XSIMD_FORCEINLINE, defaults to __forceinline on MSVC and __attribute__((always_inline)) on GCC/Clang.
  • added configurable XSIMD_CREF, defaults to const&.
  • added configurable XSIMD_CALLCONV, defaults to __vectorcall when supported on MSVC, nothing otherwise.
  • applied XSIMD_FORCEINLINE to most of batch and batch_bool (basically all functions that were just one-liner calls to things in the kernel namespace)
  • replaced most (all?) instances of batch<> const& with batch<> XSIMD_CREF
  • applied XSIMD_CALLCONV to pretty much every function in the lib
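
To make the list above concrete, here is a minimal sketch of how the three macros could be given overridable defaults and then applied to a function. It is not the actual diff: the `#ifndef` guards, the exact platform checks, the leading `inline` on the GCC/Clang branch, and the `toy_batch`, `toy_add` and `toy_add_demo` names are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Assumed defaults, each guarded so that a user-supplied #define wins.
#ifndef XSIMD_FORCEINLINE
#    if defined(_MSC_VER)
#        define XSIMD_FORCEINLINE __forceinline
#    elif defined(__GNUC__) || defined(__clang__)
#        define XSIMD_FORCEINLINE inline __attribute__((always_inline))
#    else
#        define XSIMD_FORCEINLINE inline
#    endif
#endif

#ifndef XSIMD_CREF
#    define XSIMD_CREF const&
#endif

#ifndef XSIMD_CALLCONV
// Assumption: treat __vectorcall as available on MSVC for x86/x64 targets only.
#    if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
#        define XSIMD_CALLCONV __vectorcall
#    else
#        define XSIMD_CALLCONV
#    endif
#endif

// Hypothetical stand-in for a SIMD batch type (not xsimd::batch).
struct toy_batch
{
    float lanes[4];
};

// Order: inline hint, return type, calling convention, name, then
// parameters spelled `T XSIMD_CREF` instead of `T const&`.
XSIMD_FORCEINLINE toy_batch XSIMD_CALLCONV toy_add(toy_batch XSIMD_CREF a, toy_batch XSIMD_CREF b)
{
    toy_batch r {};
    for (std::size_t i = 0; i < 4; ++i)
        r.lanes[i] = a.lanes[i] + b.lanes[i];
    return r;
}

// Tiny self-check exercising the wrapper.
inline void toy_add_demo()
{
    const toy_batch a { { 1.f, 2.f, 3.f, 4.f } };
    const toy_batch b { { 10.f, 20.f, 30.f, 40.f } };
    const toy_batch c = toy_add(a, b);
    assert(c.lanes[0] == 11.f && c.lanes[3] == 44.f);
    (void)c; // silence unused-variable warnings when NDEBUG disables assert
}
```

Because every default sits behind an `#ifndef`, each of the three knobs can be flipped independently from the command line or from a `#define` placed before the include, which is what makes testing the permutations practical.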

I've tried to be pretty comprehensive, but there might be a few bits I missed. I ran a small subset of the xsimd benchmarks in just about every permutation of the above on MSVC (e.g. XSIMD_FORCEINLINE set to inline vs __forceinline et cetera). You can find the results here:

https://docs.google.com/spreadsheets/d/1c2Y7-NI36g8zZGumO5m0DpK5QLRh8RBmsgWhAcYDuag/edit#gid=0

You'll note there's not a huge difference between the different permutations in Release builds, but I only ran the add_fn() test from ops and ignored the double values altogether, so it wasn't super comprehensive.

If you (or @amyspark) want to test additional elements yourself (or test the library in other ways with the parameterized attributes), you can just override the attributes using #defines, e.g.:

#define XSIMD_CALLCONV    __vectorcall
#define XSIMD_CREF        const&
#define XSIMD_FORCEINLINE inline

#include <xsimd/xsimd.hpp>
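
For the pass-by-value permutation, one plausible override is to define XSIMD_CREF as empty, so that `T XSIMD_CREF` collapses to plain `T`. The sketch below checks that at compile time without needing xsimd itself; the guarded default, `toy_batch`, `first_lane` and the `first_arg` trait are hypothetical stand-ins, and whether an empty definition is what the PR intended for by-value testing is an assumption.

```cpp
#include <type_traits>

// Override chosen before the library header would be included:
// an empty XSIMD_CREF turns `T XSIMD_CREF` into `T`, i.e. pass by value.
#define XSIMD_CREF

// Stand-in for the library's guarded default; skipped because of the override above.
#ifndef XSIMD_CREF
#    define XSIMD_CREF const&
#endif

// Hypothetical stand-in for a SIMD batch type (not xsimd::batch).
struct toy_batch
{
    float lanes[4];
};

// Parameter spelled the way the PR describes: `batch<> XSIMD_CREF`.
constexpr float first_lane(toy_batch XSIMD_CREF b)
{
    return b.lanes[0];
}

// Helper trait (illustrative only) that extracts the parameter type
// of a unary function pointer.
template <class F>
struct first_arg;

template <class R, class A>
struct first_arg<R (*)(A)>
{
    using type = A;
};

// With the empty override, the parameter is a plain value, not a reference.
static_assert(!std::is_reference<first_arg<decltype(&first_lane)>::type>::value,
              "XSIMD_CREF was overridden to pass by value");
```

Deleting the `#define XSIMD_CREF` line at the top restores the `const&` default, at which point the `static_assert` fires, which is a quick way to confirm which permutation a given build is actually using.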

Disclaimer: I only changed what was necessary to get it to work on VS2022 for x64. I probably missed some ARM shenanigans so it might need extra work if you're testing there.

(I also drive-by moved XSIMD_NODISCARD to xsimd_config.hpp because it seems like it probably belongs there)

marzer, Jun 26 '22 22:06