simde
Macros and postincrement
When the postincrement operator (i++) is used in an argument to a SIMDe function call, the variable is sometimes incremented multiple times, because the function is implemented as a macro that expands its argument more than once. See this question for reference.
In my specific case, the call _mm_srli_epi32(InReg1, shift++) ends up incrementing shift by 4, but only on the ARM platform. This is very difficult to debug because the same code works on Intel machines, and the call in question is nested very deep in a third-party library.
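To illustrate the mechanism outside of SIMDe, here is a minimal sketch in plain C. The names SHIFT_RIGHT_MACRO, shift_right_fn, shift_after_macro_call and shift_after_fn_call are hypothetical and are not part of SIMDe. The macro mentions its argument three times, so a shift++ argument is evaluated three times; the exact count in a real macro depends on how many times its expansion evaluates the argument.

```c
#include <stdint.h>

/* Hypothetical stand-in for a macro-based intrinsic wrapper (not SIMDe code).
 * The argument `n` appears three times in the expansion, so a side effect
 * in the argument expression runs once per evaluated occurrence. */
#define SHIFT_RIGHT_MACRO(x, n) \
  (((n) <= 0) ? (x) : (((n) > 31) ? 0u : ((x) >> (n))))

/* Function version: the argument is evaluated exactly once at the call site. */
static inline uint32_t shift_right_fn(uint32_t x, int n) {
  return (n <= 0) ? x : ((n > 31) ? 0u : (x >> n));
}

/* Calls the macro with shift++ and returns the final value of shift. */
static int shift_after_macro_call(uint32_t *result) {
  int shift = 1;
  /* Expands to three separate shift++ expressions: two in the range
   * checks and one in the actual shift. */
  *result = SHIFT_RIGHT_MACRO((uint32_t)0x80u, shift++);
  return shift;
}

/* Calls the function with shift++ and returns the final value of shift. */
static int shift_after_fn_call(uint32_t *result) {
  int shift = 1;
  *result = shift_right_fn((uint32_t)0x80u, shift++);
  return shift;
}
```

Starting from shift = 1, the macro version leaves shift at 4 and shifts by 3 instead of 1, while the function version leaves shift at 2 and shifts by 1 as intended.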
I would recommend something like the following. It works on ARM. It does not compile in Visual Studio, because it relies on the GNU statement-expression and typeof extensions, but this branch is not required in VS anyway.
#elif defined(SIMDE_ARM_NEON_A32V7_NATIVE)
#define simde_mm_srli_epi32(a, _imm8) \
  ({ typeof (_imm8) imm8 = (_imm8); \
    (((imm8) <= 0) ? \
      (a) : \
      simde__m128i_from_neon_u32( \
        ((imm8) > 31) ? \
          vandq_u32(simde__m128i_to_neon_u32(a), vdupq_n_u32(0)) : \
          vshrq_n_u32(simde__m128i_to_neon_u32(a), ((imm8) & 31) | (((imm8) & 31) == 0)))); })
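The same "copy the argument into a local first" pattern can be tried in isolation with a small scalar sketch. SHIFT_RIGHT_ONCE, shift_right_once_ and shift_after_single_eval are hypothetical names used only for illustration, and the fallback branch for compilers without statement expressions is my own assumption, not something SIMDe does.

```c
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
/* GNU C statement expression: the argument is copied into a local exactly
 * once, and only the local is used in the rest of the expansion.
 * __typeof__ does not evaluate its operand here, so it adds no side effect. */
#define SHIFT_RIGHT_ONCE(x, n) \
  (__extension__({ \
    __typeof__(n) n_once_ = (n); \
    (n_once_ <= 0) ? (x) : ((n_once_ > 31) ? 0u : ((x) >> n_once_)); \
  }))
#else
/* Assumed fallback for compilers without statement expressions (e.g. MSVC):
 * a function call also evaluates each argument exactly once. */
static inline uint32_t shift_right_once_(uint32_t x, int n) {
  return (n <= 0) ? x : ((n > 31) ? 0u : (x >> n));
}
#define SHIFT_RIGHT_ONCE(x, n) shift_right_once_((x), (n))
#endif

/* Calls the macro with shift++ and returns the final value of shift. */
static int shift_after_single_eval(uint32_t *result) {
  int shift = 1;
  *result = SHIFT_RIGHT_ONCE((uint32_t)0x80u, shift++);
  return shift;
}
```

With this version shift goes from 1 to 2, and the value is shifted by 1, which is what a caller writing shift++ would expect.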