
Proposal for `batch_constant` bitwise shift.

Open AntoinePrv opened this issue 1 month ago • 1 comment

There is a trick borrowed from Lemire that I am using to implement left shift when it is unavailable: multiply by 2^n. (The rest of the trick is that for right shifting, if you know that your data will "fit", you can left shift each element by some per-element amount, followed by a constant right shift on the whole batch.)
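A minimal scalar sketch of the trick, with a `std::array` standing in for a batch of `uint16_t` lanes. The helper names `shl_by_mul` and `shr_via_shl` are hypothetical and not part of xsimd; a real implementation would use the architecture's low-half multiply and constant shift instead of the loops.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: per-lane left shift emulated as a multiply by 2^n.
// Assumes every n[i] is in [0, 15].
template <std::size_t N>
std::array<std::uint16_t, N> shl_by_mul(std::array<std::uint16_t, N> x,
                                        const std::array<unsigned, N>& n)
{
    for (std::size_t i = 0; i < N; ++i)
    {
        // 2^n for this lane; with constant shift amounts it is known at compile time.
        const std::uint32_t pow2 = std::uint32_t{1} << n[i];
        // Keeping the low 16 bits of the product gives x << n.
        x[i] = static_cast<std::uint16_t>(std::uint32_t{x[i]} * pow2);
    }
    return x;
}

// Hypothetical helper: per-lane right shift by n[i], done as a per-lane left
// shift by (max_n - n[i]) followed by one constant right shift by max_n.
// Assumes n[i] <= max_n <= 15 and that x[i] << (max_n - n[i]) still fits in
// 16 bits (the "fit" condition), otherwise high bits are lost.
template <std::size_t N>
std::array<std::uint16_t, N> shr_via_shl(std::array<std::uint16_t, N> x,
                                         const std::array<unsigned, N>& n,
                                         unsigned max_n)
{
    for (std::size_t i = 0; i < N; ++i)
    {
        const std::uint32_t pow2 = std::uint32_t{1} << (max_n - n[i]);
        const auto shifted = static_cast<std::uint16_t>(std::uint32_t{x[i]} * pow2);
        // The same shift amount is applied to every lane here.
        x[i] = static_cast<std::uint16_t>(shifted >> max_n);
    }
    return x;
}
```

The right-shift variant only matches a true `x >> n` when no bits overflow the lane during the intermediate left shift, which is why it needs the data to "fit".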

With this, we can add missing variable shift on x86_64 for:

  • SSE2:
    • left uint16_t
    • left uint32_t
  • AVX2:
    • left uint16_t

On top of this, we can implement a variable shift for a given integer size by performing two shifts plus a mask on an integer type twice as wide, which further enables:

  • SSE2:
    • left uint8_t
  • AVX2:
    • left uint8_t
    • right uint16_t
    • right uint8_t
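One way to read the "two shifts plus a mask" idea, sketched on a single `uint16_t` holding two `uint8_t` lanes. The helper names `shl_u8_pair` and `shr_u8_pair` are hypothetical, shift amounts are assumed to be in [0, 7], and the actual implementation may order the masks differently.

```cpp
#include <cstdint>

// Hypothetical helper: left shift two packed uint8_t lanes independently,
// using only 16-bit shifts and masks.
inline std::uint16_t shl_u8_pair(std::uint16_t w, unsigned n_lo, unsigned n_hi)
{
    // Low byte: shift the whole word, then drop the bits that spilled into the high byte.
    const std::uint32_t lo = (std::uint32_t{w} << n_lo) & 0x00FFu;
    // High byte: clear the low byte first so none of its bits shift in.
    const std::uint32_t hi = ((std::uint32_t{w} & 0xFF00u) << n_hi) & 0xFF00u;
    return static_cast<std::uint16_t>(lo | hi);
}

// Hypothetical helper: right shift two packed uint8_t lanes independently.
inline std::uint16_t shr_u8_pair(std::uint16_t w, unsigned n_lo, unsigned n_hi)
{
    // Low byte: clear the high byte first so none of its bits shift in.
    const std::uint32_t lo = (std::uint32_t{w} & 0x00FFu) >> n_lo;
    // High byte: shift the whole word, then drop the bits that spilled into the low byte.
    const std::uint32_t hi = (std::uint32_t{w} >> n_hi) & 0xFF00u;
    return static_cast<std::uint16_t>(lo | hi);
}
```

Each narrow lane costs one wide shift and one or two masks, so a full batch needs two wide variable shifts (one for the even bytes, one for the odd bytes) before the results are combined.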

The main drawback is adding a new batch_constant API for left shift (more code to re-dispatch in the general case). This is necessary because we need the left shift to create the power of two.

I think this is still useful because it enables some shifts for the whole SSE inheritance.

I have a working partial implementation that I could turn into a PR in the next couple days, and hopefully into the 14.0 release. What do you think @serge-sans-paille?

AntoinePrv, Nov 21 '25 10:11

For left shifting, that should probably be the generic default implementation, right? I'm fine with that either way.

serge-sans-paille, Dec 04 '25 13:12