Exploration of calling convention, pass by ref/value and forced inline on MSVC
Hola. This PR is a spiritual successor to #646. I don't really intend for it to be merged (it's a massive diff). Rather, I've done the work to parameterize calling convention, pass by ref/value and forced inlining so they can be tested individually or together as desired, since these were all the variables discussed in that PR.
The main changes are:
- added configurable `XSIMD_FORCEINLINE`, defaults to `__forceinline` on MSVC and `__attribute((always_inline))` on GCC/Clang.
- added configurable `XSIMD_CREF`, defaults to `const&`.
- added configurable `XSIMD_CALLCONV`, defaults to `__vectorcall` when supported on MSVC, nothing otherwise.
- applied `XSIMD_FORCEINLINE` to most of `batch` and `batch_bool` (basically all functions that were just one-liner calls to things in the kernel namespace)
- replaced most (all?) instances of `batch<> const&` with `batch<> XSIMD_CREF`
- applied `XSIMD_CALLCONV` to pretty much every function in the lib
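For anyone skimming, here is a rough sketch of the pattern behind the list above. It is not the actual code from the diff: the platform checks are my guess at what "when supported" means, `toy_batch` and `toy_add` are made-up stand-ins for `batch<>` and a kernel function, and I've used the common `__attribute__` spelling. The idea is just that each macro gets a guarded default and is then spliced into the function signature.

```cpp
// Guarded defaults: a user-provided #define before this point wins.
// The exact conditions below are assumptions, not copied from the PR.
#ifndef XSIMD_FORCEINLINE
#  if defined(_MSC_VER)
#    define XSIMD_FORCEINLINE __forceinline
#  elif defined(__GNUC__) // also covers Clang
#    define XSIMD_FORCEINLINE inline __attribute__((always_inline))
#  else
#    define XSIMD_FORCEINLINE inline
#  endif
#endif

#ifndef XSIMD_CREF
#  define XSIMD_CREF const& // define as empty to pass by value instead
#endif

#ifndef XSIMD_CALLCONV
// __vectorcall exists on MSVC for x64, and for x86 when SSE2 is enabled.
#  if defined(_MSC_VER) && (defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2))
#    define XSIMD_CALLCONV __vectorcall
#  else
#    define XSIMD_CALLCONV
#  endif
#endif

// Hypothetical stand-in for batch<float>; not an xsimd type.
struct toy_batch
{
    float v[4];
};

// Hypothetical function showing where each macro sits in a signature:
// inline specifier first, calling convention between return type and name,
// and XSIMD_CREF after each parameter type.
XSIMD_FORCEINLINE toy_batch XSIMD_CALLCONV toy_add(toy_batch XSIMD_CREF a,
                                                   toy_batch XSIMD_CREF b)
{
    toy_batch r;
    for (int i = 0; i < 4; ++i)
    {
        r.v[i] = a.v[i] + b.v[i];
    }
    return r;
}
```

With the defaults on GCC/Clang this expands to an always-inline function taking `toy_batch const&`; defining `XSIMD_CREF` as empty before the defaults switches the same signature to pass by value without touching the function body.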
I've tried to be pretty comprehensive, but there might be a few bits I missed. I ran a small subset of the xsimd benchmarks in just about every permutation of the above on MSVC (e.g. XSIMD_FORCEINLINE set to inline vs __forceinline et cetera). You can find the results here:
https://docs.google.com/spreadsheets/d/1c2Y7-NI36g8zZGumO5m0DpK5QLRh8RBmsgWhAcYDuag/edit#gid=0
You'll note there's not a huge difference between the different permutations in Release builds, but I only ran the add_fn() test from ops and ignored the double values altogether, so it wasn't super comprehensive.
If you (or @amyspark) want to test additional elements yourself (or test the library in other ways with the parameterized attributes), you can just override the attributes using #defines, e.g.:
```cpp
#define XSIMD_CALLCONV __vectorcall
#define XSIMD_CREF const&
#define XSIMD_FORCEINLINE inline
#include <xsimd/xsimd.hpp>
```
Disclaimer: I only changed what was necessary to get it to work on VS2022 for x64. I probably missed some ARM shenanigans so it might need extra work if you're testing there.
(I also drive-by moved XSIMD_NODISCARD to xsimd_config.hpp because it seems like it probably belongs there)