
Support AVX2 dynamic dispatch

Open embg opened this issue 3 years ago • 4 comments

We currently detect BMI2 instructions at runtime, but users can only benefit from AVX2 if they compile with -march=haswell. It would be nice to provide AVX2 support to users who are compiling with default options.

This issue is motivated specifically by this loop in ZSTD_copyCDictTableIntoCCtx() which was added as part of my short cache PR. Overall extDict compression speed at level 1 is 2-3% slower if that loop is compiled to SSE2 instructions vs AVX2 instructions.

There may be other functions which can be tagged for AVX2 dispatch in the future. I expect this issue would be closed after tagging ZSTD_copyCDictTableIntoCCtx(), and we can tag additional functions gradually.
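To make "tagging a function for AVX2 dispatch" concrete, here is a minimal sketch, assuming GCC or Clang on x86. All names (`copy_shifted`, `copy_shifted_avx2`, `AVX2_TARGET`) are hypothetical and not zstd identifiers, and the loop is only a stand-in for the table-copy loop mentioned above. The idea is to compile the same loop body twice, once with default flags and once with a per-function `target("avx2")` attribute, and choose between them at runtime.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macros, for illustration only (not zstd's own). */
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#  define HAVE_AVX2_DISPATCH 1
#  define AVX2_TARGET __attribute__((target("avx2")))
#  define FORCE_INLINE static inline __attribute__((always_inline))
#else
#  define HAVE_AVX2_DISPATCH 0
#  define AVX2_TARGET
#  define FORCE_INLINE static inline
#endif

/* Shared loop body, inlined into each variant so the compiler can
 * auto-vectorize it with whatever ISA that variant is allowed to use.
 * Assumes shift < 32. */
FORCE_INLINE void copy_shifted_body(uint32_t* dst, const uint32_t* src,
                                    size_t n, unsigned shift)
{
    size_t i;
    for (i = 0; i < n; i++) dst[i] = src[i] >> shift;
}

/* Baseline variant: built with the translation unit's default flags. */
static void copy_shifted_default(uint32_t* dst, const uint32_t* src,
                                 size_t n, unsigned shift)
{
    copy_shifted_body(dst, src, n, shift);
}

#if HAVE_AVX2_DISPATCH
/* AVX2 variant: the attribute enables AVX2 for this function only. */
AVX2_TARGET static void copy_shifted_avx2(uint32_t* dst, const uint32_t* src,
                                          size_t n, unsigned shift)
{
    copy_shifted_body(dst, src, n, shift);
}
#endif

/* Public entry point: picks a variant at runtime. */
void copy_shifted(uint32_t* dst, const uint32_t* src, size_t n, unsigned shift)
{
#if HAVE_AVX2_DISPATCH
    if (__builtin_cpu_supports("avx2")) {
        copy_shifted_avx2(dst, src, n, shift);
        return;
    }
#endif
    copy_shifted_default(dst, src, n, shift);
}
```

A real implementation would probably cache the detection result once per context rather than querying it on every call, and would need a separate path for compilers without target attributes.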

embg avatar Dec 08 '22 16:12 embg

I already researched how we can safely detect AVX2 at runtime: https://stackoverflow.com/questions/72522885/are-the-xgetbv-and-cpuid-checks-sufficient-to-guarantee-avx2-support
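For reference, the check discussed there can be sketched roughly as follows, assuming GCC or Clang with `<cpuid.h>` on x86. The function name `has_avx2` is hypothetical. The point is that the CPUID AVX2 bit alone is not enough: the OS must also have enabled saving of the YMM state, which is what the XGETBV step verifies.

```c
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

/* Returns 1 if AVX2 is usable (CPU supports it and the OS saves YMM state),
 * 0 otherwise. Hypothetical helper, for illustration only. */
static int has_avx2(void)
{
    unsigned eax, ebx, ecx, edx;
    unsigned xcr0_lo, xcr0_hi;

    /* CPUID leaf 1: ECX bit 27 = OSXSAVE, ECX bit 28 = AVX. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return 0;
    if (!(ecx & (1u << 27)) || !(ecx & (1u << 28))) return 0;

    /* XGETBV with ECX=0 reads XCR0; bits 1 and 2 mean the OS has
     * enabled XMM and YMM state saving. Only legal when OSXSAVE is set. */
    __asm__ volatile ("xgetbv" : "=a"(xcr0_lo), "=d"(xcr0_hi) : "c"(0));
    if ((xcr0_lo & 0x6u) != 0x6u) return 0;

    /* CPUID leaf 7, subleaf 0: EBX bit 5 = AVX2. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) return 0;
    return (int)((ebx >> 5) & 1u);
}
#else
/* Non-x86 or unsupported compiler: report no AVX2. */
static int has_avx2(void) { return 0; }
#endif
```

On GCC and Clang, `__builtin_cpu_supports("avx2")` is intended to perform an equivalent check; the explicit version is mainly useful where that builtin is unavailable.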

embg avatar Dec 08 '22 16:12 embg

@ValZapod At least on Linux, x86 feature levels have become a thing. Some distributions, such as CachyOS, already offer repositories compiled for x86-64-v3, which is very close to -march=haswell.

ms178 avatar Mar 14 '23 17:03 ms178