
Better detection for avx512f functionality

Open ryandesign opened this issue 2 months ago • 1 comments

When building htslib 1.20 on OS X 10.11.6 with the version of clang included in Xcode 8.2.1 (Apple LLVM version 8.0.0 (clang-800.0.42.1)), configure says:

checking C compiler flags needed for sse4.1... -msse4.1 -mssse3 -mpopcnt
checking C compiler flags needed for avx2... -mavx2 -mpopcnt
checking C compiler flags needed for avx512f... -mavx512f -mpopcnt

and the build fails:

htscodecs/htscodecs/rANS_static32x16pr_avx512.c:79:42: warning: implicit declaration of function '_mm512_castsi512_si256' is invalid in C99 [-Wimplicit-function-declaration]
    _mm256_store_si256((__m256i *)(c),   _mm512_castsi512_si256(idx));
                                         ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:79:42: error: passing 'int' to parameter of incompatible type '__m256i' (vector of 4 'long long' values)
    _mm256_store_si256((__m256i *)(c),   _mm512_castsi512_si256(idx));
                                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~
/Library/Developer/CommandLineTools/usr/bin/../lib/clang/8.0.0/include/avxintrin.h:813:42: note: passing argument to parameter '__a' here
_mm256_store_si256(__m256i *__p, __m256i __a)
                                         ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:80:42: warning: implicit declaration of function '_mm512_extracti64x4_epi64' is invalid in C99 [-Wimplicit-function-declaration]
    _mm256_store_si256((__m256i *)(c+8), _mm512_extracti64x4_epi64(idx, 1));
                                         ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:80:42: error: passing 'int' to parameter of incompatible type '__m256i' (vector of 4 'long long' values)
    _mm256_store_si256((__m256i *)(c+8), _mm512_extracti64x4_epi64(idx, 1));
                                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Library/Developer/CommandLineTools/usr/bin/../lib/clang/8.0.0/include/avxintrin.h:813:42: note: passing argument to parameter '__a' here
_mm256_store_si256(__m256i *__p, __m256i __a)
                                         ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:103:12: warning: implicit declaration of function '_mm512_inserti64x4' is invalid in C99 [-Wimplicit-function-declaration]
    return _mm512_inserti64x4(_mm512_castsi256_si512(y0), y1, 1);
           ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:103:31: warning: implicit declaration of function '_mm512_castsi256_si512' is invalid in C99 [-Wimplicit-function-declaration]
    return _mm512_inserti64x4(_mm512_castsi256_si512(y0), y1, 1);
                              ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:103:12: error: returning 'int' from a function with incompatible result type '__m512i' (vector of 8 'long long' values)
    return _mm512_inserti64x4(_mm512_castsi256_si512(y0), y1, 1);
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:112:42: error: passing 'int' to parameter of incompatible type '__m256i' (vector of 4 'long long' values)
    _mm256_store_si256((__m256i *)(c),   _mm512_castsi512_si256(idx));
                                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~
/Library/Developer/CommandLineTools/usr/bin/../lib/clang/8.0.0/include/avxintrin.h:813:42: note: passing argument to parameter '__a' here
_mm256_store_si256(__m256i *__p, __m256i __a)
                                         ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:113:42: error: passing 'int' to parameter of incompatible type '__m256i' (vector of 4 'long long' values)
    _mm256_store_si256((__m256i *)(c+8), _mm512_extracti64x4_epi64(idx, 1));
                                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Library/Developer/CommandLineTools/usr/bin/../lib/clang/8.0.0/include/avxintrin.h:813:42: note: passing argument to parameter '__a' here
_mm256_store_si256(__m256i *__p, __m256i __a)
                                         ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:115:12: warning: implicit declaration of function '_mm512_set_epi32' is invalid in C99 [-Wimplicit-function-declaration]
    return _mm512_set_epi32(b[c[15]], b[c[14]], b[c[13]], b[c[12]],
           ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:115:12: error: returning 'int' from a function with incompatible result type '__m512i' (vector of 8 'long long' values)
    return _mm512_set_epi32(b[c[15]], b[c[14]], b[c[13]], b[c[12]],
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:205:5: warning: implicit declaration of function '_mm512_load_si512' is invalid in C99 [-Wimplicit-function-declaration]
    LOAD512(Rv, ransN);
    ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:198:20: note: expanded from macro 'LOAD512'
    __m512i a##1 = _mm512_load_si512((__m512i *)&b[0]); \
                   ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:205:5: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
    LOAD512(Rv, ransN);
    ^~~~~~~~~~~~~~~~~~
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:198:13: note: expanded from macro 'LOAD512'
    __m512i a##1 = _mm512_load_si512((__m512i *)&b[0]); \
            ^      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<scratch space>:249:1: note: expanded from here
Rv1
^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:205:5: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
    LOAD512(Rv, ransN);
    ^~~~~~~~~~~~~~~~~~
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:199:13: note: expanded from macro 'LOAD512'
    __m512i a##2 = _mm512_load_si512((__m512i *)&b[16]);
            ^      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<scratch space>:249:1: note: expanded from here
Rv2
^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:217:22: warning: implicit declaration of function '_mm512_cvtepu8_epi32' is invalid in C99 [-Wimplicit-function-declaration]
        __m512i c1 = _mm512_cvtepu8_epi32(_mm256_extracti128_si256(c12,0));
                     ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:217:17: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
        __m512i c1 = _mm512_cvtepu8_epi32(_mm256_extracti128_si256(c12,0));
                ^    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:218:17: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
        __m512i c2 = _mm512_cvtepu8_epi32(_mm256_extracti128_si256(c12,1));
                ^    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:233:15: warning: implicit declaration of function '_mm512_maskz_compress_epi32' is invalid in C99 [-Wimplicit-function-declaration]
        Rp1 = _mm512_maskz_compress_epi32(gt_mask1, Rp1);
              ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:233:13: error: assigning to '__m512i' (vector of 8 'long long' values) from incompatible type 'int'
        Rp1 = _mm512_maskz_compress_epi32(gt_mask1, Rp1);
            ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:234:13: error: assigning to '__m512i' (vector of 8 'long long' values) from incompatible type 'int'
        Rp2 = _mm512_maskz_compress_epi32(gt_mask2, Rp2);
            ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:236:9: warning: implicit declaration of function '_mm512_mask_cvtepi32_storeu_epi16' is invalid in C99 [-Wimplicit-function-declaration]
        _mm512_mask_cvtepi32_storeu_epi16(ptr16-pc2, (1<<pc2)-1, Rp2);
        ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:242:15: warning: implicit declaration of function '_mm512_mask_srli_epi32' is invalid in C99 [-Wimplicit-function-declaration]
        Rv1 = _mm512_mask_srli_epi32(Rv1, gt_mask1, Rv1, 16);
              ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:256:43: warning: implicit declaration of function '_mm512_srli_epi64' is invalid in C99 [-Wimplicit-function-declaration]
        __m512i rf1_hm = _mm512_mul_epu32(_mm512_srli_epi64(Rv1,  32),
                                          ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:286:27: warning: implicit declaration of function '_mm512_srli_epi32' is invalid in C99 [-Wimplicit-function-declaration]
        __m512i shiftv1 = _mm512_srli_epi32(SDv1, 16);
                          ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:286:17: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
        __m512i shiftv1 = _mm512_srli_epi32(SDv1, 16);
                ^         ~~~~~~~~~~~~~~~~~~~~~~~~~~~
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:287:17: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
        __m512i shiftv2 = _mm512_srli_epi32(SDv2, 16);
                ^         ~~~~~~~~~~~~~~~~~~~~~~~~~~~
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:289:23: warning: implicit declaration of function '_mm512_srlv_epi32' is invalid in C99 [-Wimplicit-function-declaration]
        __m512i qv1 = _mm512_srlv_epi32(rfv1, shiftv1);
                      ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:303:5: warning: implicit declaration of function '_mm512_store_si512' is invalid in C99 [-Wimplicit-function-declaration]
    STORE512(Rv, ransN);
    ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:202:5: note: expanded from macro 'STORE512'
    _mm512_store_si512((__m256i *)&b[0], a##1); \
    ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:368:18: warning: implicit declaration of function '_mm512_load_epi32' is invalid in C99 [-Wimplicit-function-declaration]
    __m512i R1 = _mm512_load_epi32(&Rv[0]);
                 ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:368:13: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
    __m512i R1 = _mm512_load_epi32(&Rv[0]);
            ^    ~~~~~~~~~~~~~~~~~~~~~~~~~
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:369:13: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
    __m512i R2 = _mm512_load_epi32(&Rv[16]);
            ^    ~~~~~~~~~~~~~~~~~~~~~~~~~~
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:391:31: warning: implicit declaration of function '_mm512_cvtepu16_epi32' is invalid in C99 [-Wimplicit-function-declaration]
      __m512i renorm_words1 = _mm512_cvtepu16_epi32(_mm256_loadu_si256((const __m256i *)sp)); // next 16 words
                              ^
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:391:15: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
      __m512i renorm_words1 = _mm512_cvtepu16_epi32(_mm256_loadu_si256((const __m256i *)sp)); // next 16 words
              ^               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:394:15: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
      __m512i f1 = _mm512_srli_epi32(S1, TF_SHIFT+8);
              ^    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
htscodecs/htscodecs/rANS_static32x16pr_avx512.c:395:15: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
      __m512i f2 = _mm512_srli_epi32(S2, TF_SHIFT+8);
              ^    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fatal error: too many errors emitted, stopping now [-ferror-limit=]
16 warnings and 20 errors generated.

10.11 log


On OS X 10.12.6 with the version of clang included in Xcode 9.2 (Apple LLVM version 9.0.0 (clang-900.0.39.2)), configure says:

checking C compiler flags needed for sse4.1... -msse4.1 -mssse3 -mpopcnt
checking C compiler flags needed for avx2... -mavx2 -mpopcnt
checking C compiler flags needed for avx512f... -mavx512f -mpopcnt

and the build succeeds.

10.12 log


On OS X 10.10.5 with the version of clang included in Xcode 7.2.1 (Apple LLVM version 7.0.2 (clang-700.1.81)), configure says:

checking C compiler flags needed for sse4.1... -msse4.1 -mssse3 -mpopcnt
checking C compiler flags needed for avx2... -mavx2 -mpopcnt
checking C compiler flags needed for avx512f... unsupported

and the build succeeds.

10.10 log


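In the 10.11 case the compiler accepts -mavx512f, yet the _mm512_* intrinsics that htscodecs uses are undeclared (hence the implicit-declaration warnings above), so the build fails. This makes me think the configure test could be improved so that it detects not only whether the compiler accepts the avx512f flags but also whether code using the AVX-512 intrinsics actually compiles with them.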
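
As a rough illustration of what such a stronger test might compile, here is a minimal sketch in C. It is not htslib's actual configure check. The function names `avx512f_intrinsics_ok` and `probe_avx512f` are hypothetical. The `target("avx512f")` attribute and `__builtin_cpu_supports` are GCC/clang extensions, assumed here only so the sketch builds without extra flags and does not execute AVX-512 instructions on a CPU that lacks them. A real configure check would more likely compile with the candidate flags and only compile or link, not run.

```c
/* Sketch of a stronger AVX-512F probe; all function names are hypothetical. */
#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Uses three intrinsics that the failing log reports as undeclared:
 * _mm512_set_epi32, _mm512_srli_epi32 and _mm512_store_si512.
 * Assumption: the GCC/clang target attribute stands in for -mavx512f
 * so that the sketch builds without extra command-line flags. */
__attribute__((target("avx512f")))
static int avx512f_intrinsics_ok(void)
{
    /* _mm512_store_si512 requires a 64-byte aligned destination. */
    __attribute__((aligned(64))) int out[16];
    int i;

    /* Arguments run from lane 15 down to lane 0, so lane i holds 16 * i. */
    __m512i v = _mm512_set_epi32(240, 224, 208, 192, 176, 160, 144, 128,
                                 112,  96,  80,  64,  48,  32,  16,   0);
    __m512i s = _mm512_srli_epi32(v, 4);   /* lane i becomes i */

    _mm512_store_si512(out, s);
    for (i = 0; i < 16; i++)
        if (out[i] != i)
            return 0;
    return 1;
}
#endif

/* Returns 1 if the intrinsics compiled and computed the expected values,
 * 0 if they computed something else, and -1 if nothing was executed
 * because the machine running the probe is not x86 or lacks AVX-512F. */
int probe_avx512f(void)
{
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx512f"))
        return avx512f_intrinsics_ok();
#endif
    return -1;
}
```

In a configure script, the body of `avx512f_intrinsics_ok` could be placed in the program that `AC_LANG_PROGRAM` generates and passed to `AC_LINK_IFELSE` with the candidate flags. With a compiler like the Xcode 8.2.1 clang above, where these intrinsics are undeclared, that program would presumably fail to build, and configure could then report avx512f as unsupported.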

ryandesign, Apr 19 '24 21:04