simdutf8

Add AVX-512 support

Open hkratz opened this issue 4 years ago • 3 comments

  • Use 256-bit registers.
  • Use masked loads if possible.
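As a rough illustration of the two points above (not simdutf8's actual code), the sketch below models what a zero-masking load into a 256-bit register would produce for a final block shorter than 32 bytes. The names `tail_mask`, `load_tail_zeroed` and `masked_256_load_available` are hypothetical and exist only for this example. The scalar loop stands in for a real masked byte load such as the `_mm256_maskz_loadu_epi8` intrinsic, which needs AVX-512BW and AVX-512VL, so the sketch compiles without any AVX-512 intrinsics.

```rust
/// Number of byte lanes in a 256-bit register.
const LANES: usize = 32;

/// Builds a mask with the low `len` bits set, one bit per byte lane.
/// (Hypothetical helper, named for this sketch only.)
fn tail_mask(len: usize) -> u32 {
    assert!(len <= LANES);
    if len == LANES {
        u32::MAX
    } else {
        (1u32 << len) - 1
    }
}

/// Scalar model of a zero-masking load: lanes whose mask bit is set are
/// copied from `input`; all other lanes stay zero and are never read.
fn load_tail_zeroed(input: &[u8]) -> [u8; LANES] {
    let mask = tail_mask(input.len());
    let mut block = [0u8; LANES];
    for lane in 0..LANES {
        if (mask >> lane) & 1 == 1 {
            block[lane] = input[lane];
        }
    }
    block
}

/// Reports whether the running CPU offers the features a 256-bit masked
/// byte load would need (AVX-512BW and AVX-512VL). Always false on
/// non-x86 targets.
fn masked_256_load_available() -> bool {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        is_x86_feature_detected!("avx512bw") && is_x86_feature_detected!("avx512vl")
    }
    #[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
    {
        false
    }
}
```

With real intrinsics, the loop in `load_tail_zeroed` would presumably collapse to a single masked load, and the mask would keep the load from touching bytes past the end of the input, so the tail would not have to be copied into a padded buffer first.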

hkratz, Apr 21 '21 08:04

AVX-512 intrinsics are currently nightly-only and the speedup potential is unclear. Furthermore, AVX throttling needs to be taken into consideration.

hkratz, Apr 26 '21 21:04

Throttling is a concern, but presumably only with wide (512-bit) registers, as @travisdowns explained well in his answer. Stick with 256-bit registers and you'll be fine (in this instance, no heavy instructions are involved).

lemire, Apr 27 '21 01:04

Newer client chips (e.g. Ice Lake, Tiger Lake and Rocket Lake) work a bit differently (the heavy vs. light distinction disappears and only width seems to matter), but regardless, Daniel's advice still applies: you shouldn't see license-based throttling with 256-bit ops.

travisdowns, Apr 27 '21 02:04