Build fails when AVX is disabled on Windows with the MSVC clang-cl compiler

Open wangyoucao577 opened this issue 2 years ago • 8 comments

Hi,

I'm trying to disable AVX when compiling on Windows, to stay compatible with old CPUs that have no AVX instruction support. I'm using the clang-cl compiler (clang 12) that is installed with the latest Visual Studio 2022, but the build fails. Here are the reproduction steps and the error log.

> mkdir -p build && cd build 
> cmake .. -GNinja -DCMAKE_C_COMPILER=clang-cl -DOPUS_X86_MAY_HAVE_AVX=OFF
> ninja
[145/149] Building C object CMakeFiles\opus.dir\silk\x86\VQ_WMat_EC_sse4_1.c.obj
FAILED: CMakeFiles/opus.dir/silk/x86/VQ_WMat_EC_sse4_1.c.obj
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\bin\clang-cl.exe  /nologo -DENABLE_HARDENING -DHAVE_CONFIG_H -DHAVE_LRINT -DHAVE_LRINTF -DOPUS_BUILD -DOPUS_HAVE_RTCD -DOPUS_X86_MAY_HAVE_SSE -DOPUS_X86_MAY_HAVE_SSE2 -DOPUS_X86_MAY_HAVE_SSE4_1 -DVAR_ARRAYS -D_CRT_SECURE_NO_WARNINGS -IE:\opus\include -IE:\opus\build -IE:\opus -IE:\opus\celt -IE:\opus\silk -IE:\opus\silk\float /DWIN32 /D_WINDOWS /W3 /MDd /Zi /Ob0 /Od /RTC1 /GS /showIncludes /FoCMakeFiles\opus.dir\silk\x86\VQ_WMat_EC_sse4_1.c.obj /FdCMakeFiles\opus.dir\opus.pdb -c -- E:\opus\silk\x86\VQ_WMat_EC_sse4_1.c
E:\opus\silk\x86\VQ_WMat_EC_sse4_1.c(88,26): error: always_inline function '_mm_cvtepi8_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_VQ_WMat_EC_sse4_1' that is compiled without support for 'sse4.1'
        v_cb_row_31_Q7 = OP_CVTEPI8_EPI32_M32( &cb_row_Q7[ 1 ] );
                         ^
E:\opus\celt/x86/x86cpu.h(68,3): note: expanded from macro 'OP_CVTEPI8_EPI32_M32'
 (_mm_cvtepi8_epi32(_mm_cvtsi32_si128(OP_LOADU_EPI32(x))))
  ^
E:\opus\silk\x86\VQ_WMat_EC_sse4_1.c(90,26): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_VQ_WMat_EC_sse4_1' that is compiled without support for 'sse4.1'
        v_cb_row_31_Q7 = _mm_mul_epi32( v_XX_31_Q17, v_cb_row_31_Q7 );
                         ^
E:\opus\silk\x86\VQ_WMat_EC_sse4_1.c(91,26): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_VQ_WMat_EC_sse4_1' that is compiled without support for 'sse4.1'
        v_cb_row_42_Q7 = _mm_mul_epi32( v_XX_42_Q17, v_cb_row_42_Q7 );
                         ^
3 errors generated.
[147/149] Building C object CMakeFiles\opus.dir\silk\x86\NSQ_sse4_1.c.obj
FAILED: CMakeFiles/opus.dir/silk/x86/NSQ_sse4_1.c.obj
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\bin\clang-cl.exe  /nologo -DENABLE_HARDENING -DHAVE_CONFIG_H -DHAVE_LRINT -DHAVE_LRINTF -DOPUS_BUILD -DOPUS_HAVE_RTCD -DOPUS_X86_MAY_HAVE_SSE -DOPUS_X86_MAY_HAVE_SSE2 -DOPUS_X86_MAY_HAVE_SSE4_1 -DVAR_ARRAYS -D_CRT_SECURE_NO_WARNINGS -IE:\opus\include -IE:\opus\build -IE:\opus -IE:\opus\celt -IE:\opus\silk -IE:\opus\silk\float /DWIN32 /D_WINDOWS /W3 /MDd /Zi /Ob0 /Od /RTC1 /GS /showIncludes /FoCMakeFiles\opus.dir\silk\x86\NSQ_sse4_1.c.obj /FdCMakeFiles\opus.dir\opus.pdb -c -- E:\opus\silk\x86\NSQ_sse4_1.c
E:\opus\silk\x86\NSQ_sse4_1.c(345,22): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
    a_Q12_01234567 = _mm_shuffle_epi8( a_Q12_01234567, xmm_one );
                     ^
E:\opus\silk\x86\NSQ_sse4_1.c(346,22): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
    a_Q12_89ABCDEF = _mm_shuffle_epi8( a_Q12_89ABCDEF, xmm_one );
                     ^
E:\opus\silk\x86\NSQ_sse4_1.c(357,17): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
    xmm_tempa = _mm_shuffle_epi8( xmm_tempa, xmm_one );
                ^
E:\opus\silk\x86\NSQ_sse4_1.c(358,17): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
    xmm_tempb = _mm_shuffle_epi8( xmm_tempb, xmm_one );
                ^
E:\opus\silk\x86\NSQ_sse4_1.c(366,17): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
    xmm_tempa = _mm_shuffle_epi8( xmm_tempa, xmm_one );
                ^
E:\opus\silk\x86\NSQ_sse4_1.c(367,17): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
    xmm_tempb = _mm_shuffle_epi8( xmm_tempb, xmm_one );
                ^
E:\opus\silk\x86\NSQ_sse4_1.c(376,17): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
    xmm_tempa = _mm_shuffle_epi8( xmm_tempa, xmm_one );
                ^
E:\opus\silk\x86\NSQ_sse4_1.c(377,17): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
    xmm_tempb = _mm_shuffle_epi8( xmm_tempb, xmm_one );
                ^
E:\opus\silk\x86\NSQ_sse4_1.c(394,33): error: '__builtin_ia32_palignr128' needs target feature ssse3
        psLPC_Q14_hi_89ABCDEF = _mm_alignr_epi8( psLPC_Q14_hi_01234567, psLPC_Q14_hi_89ABCDEF, 2 );
                                ^
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\lib\clang\12.0.0\include\tmmintrin.h(148,12): note: expanded from macro '_mm_alignr_epi8'
  (__m128i)__builtin_ia32_palignr128((__v16qi)(__m128i)(a), \
           ^
E:\opus\silk\x86\NSQ_sse4_1.c(395,33): error: '__builtin_ia32_palignr128' needs target feature ssse3
        psLPC_Q14_lo_89ABCDEF = _mm_alignr_epi8( psLPC_Q14_lo_01234567, psLPC_Q14_lo_89ABCDEF, 2 );
                                ^
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\lib\clang\12.0.0\include\tmmintrin.h(148,12): note: expanded from macro '_mm_alignr_epi8'
  (__m128i)__builtin_ia32_palignr128((__v16qi)(__m128i)(a), \
           ^
E:\opus\silk\x86\NSQ_sse4_1.c(442,30): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
                b_Q14_3210 = OP_CVTEPI16_EPI32_M64( b_Q14 );
                             ^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
 (_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
  ^
E:\opus\silk\x86\NSQ_sse4_1.c(450,29): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
                xmm_tempa = _mm_mul_epi32( xmm_tempa, b_Q14_3210 );
                            ^
E:\opus\silk\x86\NSQ_sse4_1.c(455,37): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
                pred_lag_ptr_0123 = _mm_mul_epi32( pred_lag_ptr_0123, b_Q14_0123 );
                                    ^
E:\opus\silk\x86\NSQ_sse4_1.c(623,31): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
            xmm_xq_Q14_3210 = _mm_mul_epi32( xmm_xq_Q14_3210, xmm_Gain_Q10 );
                              ^
E:\opus\silk\x86\NSQ_sse4_1.c(624,31): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
            xmm_xq_Q14_x3x1 = _mm_mul_epi32( xmm_xq_Q14_x3x1, xmm_Gain_Q10 );
                              ^
E:\opus\silk\x86\NSQ_sse4_1.c(625,31): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
            xmm_xq_Q14_7654 = _mm_mul_epi32( xmm_xq_Q14_7654, xmm_Gain_Q10 );
                              ^
E:\opus\silk\x86\NSQ_sse4_1.c(626,31): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
            xmm_xq_Q14_x7x5 = _mm_mul_epi32( xmm_xq_Q14_x7x5, xmm_Gain_Q10 );
                              ^
E:\opus\silk\x86\NSQ_sse4_1.c(633,31): error: '__builtin_ia32_pblendw128' needs target feature sse4.1
            xmm_xq_Q14_3210 = _mm_blend_epi16( xmm_xq_Q14_3210, xmm_xq_Q14_x3x1, 0xCC );
                              ^
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\lib\clang\12.0.0\include\smmintrin.h(516,13): note: expanded from macro '_mm_blend_epi16'
  (__m128i) __builtin_ia32_pblendw128 ((__v8hi)(__m128i)(V1), \
            ^
E:\opus\silk\x86\NSQ_sse4_1.c(634,31): error: '__builtin_ia32_pblendw128' needs target feature sse4.1
            xmm_xq_Q14_7654 = _mm_blend_epi16( xmm_xq_Q14_7654, xmm_xq_Q14_x7x5, 0xCC );
                              ^
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\lib\clang\12.0.0\include\smmintrin.h(516,13): note: expanded from macro '_mm_blend_epi16'
  (__m128i) __builtin_ia32_pblendw128 ((__v8hi)(__m128i)(V1), \
            ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
[148/149] Building C object CMakeFiles\opus.dir\silk\x86\NSQ_del_dec_sse4_1.c.obj
FAILED: CMakeFiles/opus.dir/silk/x86/NSQ_del_dec_sse4_1.c.obj
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\bin\clang-cl.exe  /nologo -DENABLE_HARDENING -DHAVE_CONFIG_H -DHAVE_LRINT -DHAVE_LRINTF -DOPUS_BUILD -DOPUS_HAVE_RTCD -DOPUS_X86_MAY_HAVE_SSE -DOPUS_X86_MAY_HAVE_SSE2 -DOPUS_X86_MAY_HAVE_SSE4_1 -DVAR_ARRAYS -D_CRT_SECURE_NO_WARNINGS -IE:\opus\include -IE:\opus\build -IE:\opus -IE:\opus\celt -IE:\opus\silk -IE:\opus\silk\float /DWIN32 /D_WINDOWS /W3 /MDd /Zi /Ob0 /Od /RTC1 /GS /showIncludes /FoCMakeFiles\opus.dir\silk\x86\NSQ_del_dec_sse4_1.c.obj /FdCMakeFiles\opus.dir\opus.pdb -c -- E:\opus\silk\x86\NSQ_del_dec_sse4_1.c
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(409,18): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
    a_Q12_0123 = OP_CVTEPI16_EPI32_M64( a_Q12 );
                 ^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
 (_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(410,18): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
    a_Q12_4567 = OP_CVTEPI16_EPI32_M64( a_Q12 + 4 );
                 ^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
 (_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(413,22): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
        a_Q12_89AB = OP_CVTEPI16_EPI32_M64( a_Q12 + 8 );
                     ^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
 (_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(414,22): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
        a_Q12_CDEF = OP_CVTEPI16_EPI32_M64( a_Q12 + 12 );
                     ^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
 (_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(418,22): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
        b_Q12_0123 = OP_CVTEPI16_EPI32_M64( b_Q14 );
                     ^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
 (_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(433,39): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                tmpa                = _mm_mul_epi32( pred_lag_ptr_tmp, b_Q12_0123 );
                                      ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(437,39): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                pred_lag_ptr_tmp    = _mm_mul_epi32( pred_lag_ptr_tmp, b_sr_Q12_0123 );
                                      ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(488,35): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                tmpa            = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_0123 );    /* 0, -1, -2, -3 * 0123 -> 0*0, 2*-2 */
                                  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(495,35): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                psLPC_Q14_tmp   = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_tmp ); /* 1*-1, 3*-3 */
                                  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(502,35): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                tmpa            = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_4567 );
                                  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(508,35): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                psLPC_Q14_tmp   = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_tmp );
                                  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(517,39): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                    tmpa            = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_89AB );
                                      ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(523,39): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                    psLPC_Q14_tmp   = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_tmp );
                                      ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(530,39): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                    tmpa            = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_CDEF );
                                      ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(536,39): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                    psLPC_Q14_tmp   = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_tmp );
                                      ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(820,24): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_nsq_del_dec_scale_states_sse4_1' that is compiled without support for 'sse4.1'
        xmm_x16_x2x0 = OP_CVTEPI16_EPI32_M64( &(x16[ i ] ) );
                       ^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
 (_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(825,24): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_nsq_del_dec_scale_states_sse4_1' that is compiled without support for 'sse4.1'
        xmm_x16_x2x0 = _mm_mul_epi32( xmm_x16_x2x0, xmm_inv_gain_Q26 );
                       ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(826,24): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_nsq_del_dec_scale_states_sse4_1' that is compiled without support for 'sse4.1'
        xmm_x16_x3x1 = _mm_mul_epi32( xmm_x16_x3x1, xmm_inv_gain_Q26 );
                       ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(831,24): error: '__builtin_ia32_pblendw128' needs target feature sse4.1
        xmm_x16_x2x0 = _mm_blend_epi16( xmm_x16_x2x0, xmm_x16_x3x1, 0xCC );
                       ^
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\lib\clang\12.0.0\include\smmintrin.h(516,13): note: expanded from macro '_mm_blend_epi16'
  (__m128i) __builtin_ia32_pblendw128 ((__v8hi)(__m128i)(V1), \
            ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
ninja: build stopped: subcommand failed.

After some investigation, I found that the problem is that CMake also sets the MSVC flag for clang-cl, so the SSE4.1 support check uses check_flag(SSE4_1 /arch:SSE2) # SSE2 and above (https://github.com/xiph/opus/blob/c9d5bea13e3cb7381bfa897a45d8bab4e7b767a7/cmake/OpusFunctions.cmake#L99), which is not sufficient: as the errors above show, clang-cl then compiles the SSE4.1 sources without the 'ssse3' and 'sse4.1' target features that the intrinsics require.

MSVC doesn't support enabling SSE4.1 separately. It can only be enabled with at least /arch:AVX; otherwise it is disabled. See https://docs.microsoft.com/en-us/cpp/build/reference/arch-x86?view=msvc-170. Clang, however, supports the more flexible option -msse4.1. So in this case I recommend using if (MSVC AND CMAKE_C_COMPILER_ID STREQUAL "MSVC") to decide whether to use /arch:xx or -mxx.
I have tested this in my setup and it works well. If it's OK, I can file a PR for it. Any thoughts?
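
A minimal sketch of the compiler check proposed here. The variable name OPUS_SSE4_1_FLAG is hypothetical and is not part of the Opus tree; the real check_flag logic in cmake/OpusFunctions.cmake may need a different shape:

```cmake
# Sketch only: OPUS_SSE4_1_FLAG is a hypothetical name used for illustration.
# CMake sets MSVC to true for clang-cl as well, so the compiler ID is needed
# to tell cl.exe apart from clang-cl.
if(MSVC AND CMAKE_C_COMPILER_ID STREQUAL "MSVC")
  # Real cl.exe: keep the flag that the linked check passes today.
  set(OPUS_SSE4_1_FLAG "/arch:SSE2")
else()
  # clang-cl, clang and gcc: use the per-feature flag.
  set(OPUS_SSE4_1_FLAG "-msse4.1")
endif()
message(STATUS "SSE4.1 flag: ${OPUS_SSE4_1_FLAG}")
```

With clang-cl, CMAKE_C_COMPILER_ID is "Clang", so the second branch is taken even though MSVC is true.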

One more question: from the compilation output I saw that /arch:AVX or -mavx has been added to all files, which makes runtime detection useless. Is there a reason for that? Would it be possible to move the AVX-optimized code into separate files, so that only those files are compiled with AVX and they are not run when runtime detection finds no AVX support?
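
As a rough illustration of per-file flags, the sketch below applies a SIMD flag only to selected sources. It uses the SSE4.1 files named in the build log as the example; the list variable name is hypothetical, and the flag assumes a clang-style compiler such as clang-cl:

```cmake
# Sketch only: SSE4_1_SOURCES is a hypothetical variable name.
# Only these translation units get the SIMD flag; every other file is
# still built for the baseline CPU, so runtime detection stays meaningful.
set(SSE4_1_SOURCES
    silk/x86/NSQ_sse4_1.c
    silk/x86/NSQ_del_dec_sse4_1.c
    silk/x86/VQ_WMat_EC_sse4_1.c)
set_source_files_properties(${SSE4_1_SOURCES}
    PROPERTIES COMPILE_FLAGS "-msse4.1")
```

The same pattern would apply to AVX-specific files with -mavx, if such files existed.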

Thanks!

wangyoucao577 avatar Jul 18 '22 07:07 wangyoucao577

Hi,

clang-cl was never tested, so bugs there are not unexpected.

I think you are correct that the source file properties for MSVC are not set correctly, but I need some more time to refresh my memory. For runtime detection, the flags should for sure only be applied to the corresponding files.

So in summary:

  • Add clang-cl support to the Opus CMake build.
  • Revisit the source file properties and compiler flags for MSVC.

Feel free to file a PR; filing it on GitLab will make it easier to merge.

xnorpx avatar Jul 18 '22 21:07 xnorpx

So regarding SSE4.1: there is no auto-vectorization for MSVC, so no compiler flag is needed for the SSE4.1 sources.

See (https://stackoverflow.com/a/64057905)

Regarding AVX: there are currently no AVX optimizations in Opus, so it's either PRESUME_AVX, which enables it in all builds, or MAY_HAVE, which is essentially a no-op.

I am working on rewriting the intrinsics logic in CMake, so I will try to add more comments.

xnorpx avatar Jul 18 '22 21:07 xnorpx

I sent an email to [email protected] several days ago to apply for GitLab forking permission, but I haven't received a response so far. Could you please take a look so that I can fork and create a PR on GitLab? Thanks!

wangyoucao577 avatar Jul 21 '22 02:07 wangyoucao577

Hiya, saw the mail, just behind.

No need to ask permission to fork! Once you have a PR, I'll be happy to look.

xiphmont avatar Jul 21 '22 03:07 xiphmont

Sorry, but the fork button is greyed out, so I can't fork it.

wangyoucao577 avatar Jul 21 '22 10:07 wangyoucao577

> So regarding SSE4.1: there is no auto-vectorization for MSVC, so no compiler flag is needed for the SSE4.1 sources.
>
> See (https://stackoverflow.com/a/64057905)
>
> Regarding AVX: there are currently no AVX optimizations in Opus, so it's either PRESUME_AVX, which enables it in all builds, or MAY_HAVE, which is essentially a no-op.
>
> I am working on rewriting the intrinsics logic in CMake, so I will try to add more comments.

Yes, MSVC doesn't have flags for SSE4.1, but the MSVC-compatible clang-cl does.

wangyoucao577 avatar Jul 23 '22 02:07 wangyoucao577

https://github.com/xiph/opus/pull/257
I've opened the PR here since I still can't fork or create branches on GitLab. Please help review it. If it looks good to you, I can move it to GitLab once I have permission to do so. Thanks!

wangyoucao577 avatar Jul 23 '22 02:07 wangyoucao577

I'm also getting compilation errors with opus + clang-cl. install-x64-windows-bitwig-rel-out.log

abique avatar Feb 13 '24 18:02 abique