
Support a no-immediates mode

Open nemequ opened this issue 3 years ago • 2 comments

One use case people have found for SIMDe, which I honestly never anticipated, is using it to implement run-time emulation. One problem with this is that there are a bunch of functions with immediate-mode parameters, which must be known at compile time because they are encoded in the instruction itself instead of referencing a register.
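To make the problem concrete, here is a minimal sketch of what a run-time emulator runs into. The `demo_insn` and `demo_xmm` types and the `demo_exec_psraw` function are hypothetical names for illustration, not part of SIMDe. The point is only that the shift count arrives as data decoded from the guest instruction stream, so it is not a compile-time constant at the call site.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded guest instruction: the immediate is just a byte
 * that the emulator read out of the instruction stream at run time. */
typedef struct {
  uint8_t imm8;
} demo_insn;

/* Hypothetical 128-bit register modelled as eight signed 16-bit lanes. */
typedef struct {
  int16_t i16[8];
} demo_xmm;

/* Emulate an arithmetic right shift of every lane by insn->imm8.
 * An immediate-mode call such as _mm_srai_epi16(reg, insn->imm8) is what
 * one would like to write, but as described above the count is expected
 * to be a constant, so the emulator needs a path that takes it as an
 * ordinary run-time value. */
static demo_xmm demo_exec_psraw(demo_xmm reg, const demo_insn *insn) {
  assert(insn != NULL);
  /* Counts above 15 behave like 15: every bit becomes the sign bit. */
  const int cnt = (insn->imm8 > 15) ? 15 : insn->imm8;
  demo_xmm out;
  for (size_t i = 0; i < sizeof(out.i16) / sizeof(out.i16[0]); i++) {
    /* Assumes >> on a negative value is an arithmetic shift, which holds
     * on mainstream compilers but is implementation-defined in C. */
    out.i16[i] = (int16_t)(reg.i16[i] >> cnt);
  }
  return out;
}
```

An emulator's dispatch loop would call something like `demo_exec_psraw(regs[n], &decoded)` once per guest instruction, with `decoded.imm8` differing from one instruction to the next.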

It shouldn't actually be too difficult to support this; we would just have to avoid code paths which require a constant if a certain macro is defined (e.g., SIMDE_NO_IMMEDIATES). Obviously we would also have to disable the constant checking in the macros (SIMDE_REQUIRE_CONSTANT, SIMDE_REQUIRE_CONSTANT_RANGE).

So, for example, simde_mm_srai_epi16 might look like

SIMDE_FUNCTION_ATTRIBUTES
simde__m128i
simde_mm_srai_epi16 (simde__m128i a, const int imm8)
    SIMDE_REQUIRE_CONSTANT_RANGE(imm8, 0, 255) {
  /* MSVC requires a range of (0, 255). */
  #if defined(SIMDE_X86_SSE2_NATIVE)
    return _mm_sra_epi16(a, _mm_cvtsi32_si128(imm8));
  #else
    simde__m128i_private
      r_,
      a_ = simde__m128i_to_private(a);

    const int cnt = (imm8 & ~15) ? 15 : imm8;

    #if defined(SIMDE_ARM_NEON_A32V7_NATIVE)
      r_.neon_i16 = vshlq_s16(a_.neon_i16, vdupq_n_s16(HEDLEY_STATIC_CAST(int16_t, -cnt)));
    #elif defined(SIMDE_WASM_SIMD128_NATIVE)
      r_.wasm_v128 = wasm_i16x8_shr(a_.wasm_v128, cnt);
    #else
      SIMDE_VECTORIZE
      for (size_t i = 0 ; i < (sizeof(r_) / sizeof(r_.i16[0])) ; i++) {
        r_.i16[i] = a_.i16[i] >> cnt;
      }
    #endif

    return simde__m128i_from_private(r_);
  #endif
}
#if defined(SIMDE_X86_SSE2_NATIVE) && !defined(SIMDE_NO_IMMEDIATES)
  #define simde_mm_srai_epi16(a, imm8) _mm_srai_epi16((a), (imm8))
#elif defined(SIMDE_ARM_NEON_A32V7_NATIVE) && !defined(SIMDE_NO_IMMEDIATES)
  #define simde_mm_srai_epi16(a, imm8) vshrq_n_s16((a), (imm8))
#endif
#if defined(SIMDE_X86_SSE2_ENABLE_NATIVE_ALIASES)
  #define _mm_srai_epi16(a, imm8) simde_mm_srai_epi16(a, imm8)
#endif

Notice the additional !defined(SIMDE_NO_IMMEDIATES) checks guarding the macros that follow the function definition. Also notice that, even though there are immediate-mode implementations for NEON and SSE2, the function body contains duplicate non-immediate implementations for use when SIMDE_NO_IMMEDIATES is defined.
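The same shape can be shown in a small portable model that builds without SIMDe or any intrinsics. Everything prefixed with `demo_` or `DEMO_` is a hypothetical stand-in (`DEMO_NO_IMMEDIATES` plays the role of SIMDE_NO_IMMEDIATES), and the bit-field trick used to force a constant is just one way to do it in C, not necessarily what SIMDe's checking macros do. The run-time `assert` is my own assumption about how one might replace the disabled compile-time range check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opt-in switch standing in for SIMDE_NO_IMMEDIATES.
 * Comment it out to get the constant-only macro further down. */
#define DEMO_NO_IMMEDIATES

/* Hypothetical vector type: eight signed 16-bit lanes. */
typedef struct {
  int16_t i16[8];
} demo_i16x8;

/* Function form: always present and safe to call with a run-time count. */
static demo_i16x8 demo_srai_epi16(demo_i16x8 a, int imm8) {
  /* Run-time stand-in (an assumption) for the compile-time range check. */
  assert(imm8 >= 0 && imm8 <= 255);
  const int cnt = (imm8 & ~15) ? 15 : imm8;
  demo_i16x8 r;
  for (size_t i = 0; i < sizeof(r.i16) / sizeof(r.i16[0]); i++) {
    /* Assumes >> on a negative value is an arithmetic shift. */
    r.i16[i] = (int16_t)(a.i16[i] >> cnt);
  }
  return r;
}

#if !defined(DEMO_NO_IMMEDIATES)
  /* Macro form: shadows the function only when immediates are allowed.
   * A bit-field width must be an integer constant expression, so passing
   * a run-time count (or one outside 0..255) fails to compile here.
   * This trick is valid in C but not in C++. */
  #define demo_srai_epi16(a, imm8) \
    ((void) sizeof(struct { int ok : (((imm8) >= 0 && (imm8) <= 255) ? 1 : -1); }), \
     (demo_srai_epi16)((a), (imm8)))
#endif
```

With `DEMO_NO_IMMEDIATES` defined, a call such as `demo_srai_epi16(v, count)` with a variable `count` goes straight to the function. With it undefined, the same call is rejected at compile time and only literal counts are accepted.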

I can't really think of a good way to test this without a bunch of ifdefs in the tests, which I don't want to add. We can at least add a CI check to make sure the code is correct; I just can't think of a way to automatically make sure that we accept non-constant values, so I would expect occasional bugs (which would be easy to fix).
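For what it's worth, a hand-written check for a single function might look roughly like the sketch below. It does not solve the real problem, which is covering every immediate-mode function without per-test ifdefs. All `demo_` names are hypothetical, and `demo_srai_epi16` here is a scalar stand-in for the function under test. The count is passed through a `volatile` so the compiler cannot treat it as a constant. Against a real constant-only macro, a build with the no-immediates switch defined would then be expected to fail to compile if a constant-only path were still being selected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
  int16_t i16[8];
} demo_i16x8;

/* Scalar stand-in for the function under test. */
static demo_i16x8 demo_srai_epi16(demo_i16x8 a, int imm8) {
  const int cnt = (imm8 & ~15) ? 15 : imm8;
  demo_i16x8 r;
  for (size_t i = 0; i < 8; i++) {
    /* Assumes >> on a negative value is an arithmetic shift. */
    r.i16[i] = (int16_t)(a.i16[i] >> cnt);
  }
  return r;
}

/* Hypothetical helper: pass a value through a volatile object so it is
 * never a compile-time constant at the call site. */
static int demo_runtime_value(int v) {
  volatile int tmp = v;
  return tmp;
}

/* Checks every count in 0..255 against a reference that only shifts
 * non-negative values, and returns the number of lanes compared. */
static int demo_check_srai_epi16(void) {
  const demo_i16x8 a = {{ -32768, -1, 1, 256, 32767, -256, 0, 16 }};
  int checked = 0;
  for (int cnt = 0; cnt <= 255; cnt++) {
    const demo_i16x8 r = demo_srai_epi16(a, demo_runtime_value(cnt));
    const int eff = (cnt > 15) ? 15 : cnt;
    for (size_t i = 0; i < 8; i++) {
      const int x = a.i16[i];
      /* For negative x, ~(~x >> n) equals an arithmetic shift of x. */
      const int expected = (x < 0) ? ~(~x >> eff) : (x >> eff);
      assert(r.i16[i] == expected);
      checked++;
    }
  }
  return checked;
}
```

Calling `demo_check_srai_epi16()` from a test runner exercises all 256 counts across the eight lanes.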

CC @danoon2 & @EvgeniySpinov. What do you two think; does this sound useful for you?

nemequ, Jan 23 '21 23:01

That is a tough one. Too bad the cpp method signature isn't different for an integer constant vs. a variable int. Since I don't need this check and I use SIMDE_NO_CHECK_IMMEDIATE_CONSTANT, I think it might be OK to assume that anyone using SIMDE_NO_CHECK_IMMEDIATE_CONSTANT is responsible for their own checks.

danoon2, Jan 24 '21 16:01

My case is relatively simple (emulating the SSE 4.1, SSE 4.2, and AVX instruction sets for the Windows API at run time). I do not see any issue with the current constant approach, as I know the supported instructions at compile time.

I'm probably not a very representative user here, or am I missing something?

EvgeniySpinov, Jan 26 '21 22:01