opus
Build fails when AVX is disabled on Windows with the MSVC-compatible clang-cl compiler
Hi,
I'm trying to disable AVX in a Windows build so that it stays compatible with old CPUs that have no AVX instruction support. I use the clang-cl compiler (clang 12) that is installed with the latest Visual Studio 2022, but the build fails. Here are the reproduction steps and the error log.
> mkdir -p build && cd build
> cmake .. -GNinja -DCMAKE_C_COMPILER=clang-cl -DOPUS_X86_MAY_HAVE_AVX=OFF
> ninja
[145/149] Building C object CMakeFiles\opus.dir\silk\x86\VQ_WMat_EC_sse4_1.c.obj
FAILED: CMakeFiles/opus.dir/silk/x86/VQ_WMat_EC_sse4_1.c.obj
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\bin\clang-cl.exe /nologo -DENABLE_HARDENING -DHAVE_CONFIG_H -DHAVE_LRINT -DHAVE_LRINTF -DOPUS_BUILD -DOPUS_HAVE_RTCD -DOPUS_X86_MAY_HAVE_SSE -DOPUS_X86_MAY_HAVE_SSE2 -DOPUS_X86_MAY_HAVE_SSE4_1 -DVAR_ARRAYS -D_CRT_SECURE_NO_WARNINGS -IE:\opus\include -IE:\opus\build -IE:\opus -IE:\opus\celt -IE:\opus\silk -IE:\opus\silk\float /DWIN32 /D_WINDOWS /W3 /MDd /Zi /Ob0 /Od /RTC1 /GS /showIncludes /FoCMakeFiles\opus.dir\silk\x86\VQ_WMat_EC_sse4_1.c.obj /FdCMakeFiles\opus.dir\opus.pdb -c -- E:\opus\silk\x86\VQ_WMat_EC_sse4_1.c
E:\opus\silk\x86\VQ_WMat_EC_sse4_1.c(88,26): error: always_inline function '_mm_cvtepi8_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_VQ_WMat_EC_sse4_1' that is compiled without support for 'sse4.1'
v_cb_row_31_Q7 = OP_CVTEPI8_EPI32_M32( &cb_row_Q7[ 1 ] );
^
E:\opus\celt/x86/x86cpu.h(68,3): note: expanded from macro 'OP_CVTEPI8_EPI32_M32'
(_mm_cvtepi8_epi32(_mm_cvtsi32_si128(OP_LOADU_EPI32(x))))
^
E:\opus\silk\x86\VQ_WMat_EC_sse4_1.c(90,26): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_VQ_WMat_EC_sse4_1' that is compiled without support for 'sse4.1'
v_cb_row_31_Q7 = _mm_mul_epi32( v_XX_31_Q17, v_cb_row_31_Q7 );
^
E:\opus\silk\x86\VQ_WMat_EC_sse4_1.c(91,26): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_VQ_WMat_EC_sse4_1' that is compiled without support for 'sse4.1'
v_cb_row_42_Q7 = _mm_mul_epi32( v_XX_42_Q17, v_cb_row_42_Q7 );
^
3 errors generated.
[147/149] Building C object CMakeFiles\opus.dir\silk\x86\NSQ_sse4_1.c.obj
FAILED: CMakeFiles/opus.dir/silk/x86/NSQ_sse4_1.c.obj
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\bin\clang-cl.exe /nologo -DENABLE_HARDENING -DHAVE_CONFIG_H -DHAVE_LRINT -DHAVE_LRINTF -DOPUS_BUILD -DOPUS_HAVE_RTCD -DOPUS_X86_MAY_HAVE_SSE -DOPUS_X86_MAY_HAVE_SSE2 -DOPUS_X86_MAY_HAVE_SSE4_1 -DVAR_ARRAYS -D_CRT_SECURE_NO_WARNINGS -IE:\opus\include -IE:\opus\build -IE:\opus -IE:\opus\celt -IE:\opus\silk -IE:\opus\silk\float /DWIN32 /D_WINDOWS /W3 /MDd /Zi /Ob0 /Od /RTC1 /GS /showIncludes /FoCMakeFiles\opus.dir\silk\x86\NSQ_sse4_1.c.obj /FdCMakeFiles\opus.dir\opus.pdb -c -- E:\opus\silk\x86\NSQ_sse4_1.c
E:\opus\silk\x86\NSQ_sse4_1.c(345,22): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
a_Q12_01234567 = _mm_shuffle_epi8( a_Q12_01234567, xmm_one );
^
E:\opus\silk\x86\NSQ_sse4_1.c(346,22): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
a_Q12_89ABCDEF = _mm_shuffle_epi8( a_Q12_89ABCDEF, xmm_one );
^
E:\opus\silk\x86\NSQ_sse4_1.c(357,17): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
xmm_tempa = _mm_shuffle_epi8( xmm_tempa, xmm_one );
^
E:\opus\silk\x86\NSQ_sse4_1.c(358,17): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
xmm_tempb = _mm_shuffle_epi8( xmm_tempb, xmm_one );
^
E:\opus\silk\x86\NSQ_sse4_1.c(366,17): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
xmm_tempa = _mm_shuffle_epi8( xmm_tempa, xmm_one );
^
E:\opus\silk\x86\NSQ_sse4_1.c(367,17): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
xmm_tempb = _mm_shuffle_epi8( xmm_tempb, xmm_one );
^
E:\opus\silk\x86\NSQ_sse4_1.c(376,17): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
xmm_tempa = _mm_shuffle_epi8( xmm_tempa, xmm_one );
^
E:\opus\silk\x86\NSQ_sse4_1.c(377,17): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
xmm_tempb = _mm_shuffle_epi8( xmm_tempb, xmm_one );
^
E:\opus\silk\x86\NSQ_sse4_1.c(394,33): error: '__builtin_ia32_palignr128' needs target feature ssse3
psLPC_Q14_hi_89ABCDEF = _mm_alignr_epi8( psLPC_Q14_hi_01234567, psLPC_Q14_hi_89ABCDEF, 2 );
^
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\lib\clang\12.0.0\include\tmmintrin.h(148,12): note: expanded from macro '_mm_alignr_epi8'
(__m128i)__builtin_ia32_palignr128((__v16qi)(__m128i)(a), \
^
E:\opus\silk\x86\NSQ_sse4_1.c(395,33): error: '__builtin_ia32_palignr128' needs target feature ssse3
psLPC_Q14_lo_89ABCDEF = _mm_alignr_epi8( psLPC_Q14_lo_01234567, psLPC_Q14_lo_89ABCDEF, 2 );
^
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\lib\clang\12.0.0\include\tmmintrin.h(148,12): note: expanded from macro '_mm_alignr_epi8'
(__m128i)__builtin_ia32_palignr128((__v16qi)(__m128i)(a), \
^
E:\opus\silk\x86\NSQ_sse4_1.c(442,30): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
b_Q14_3210 = OP_CVTEPI16_EPI32_M64( b_Q14 );
^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
(_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
^
E:\opus\silk\x86\NSQ_sse4_1.c(450,29): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
xmm_tempa = _mm_mul_epi32( xmm_tempa, b_Q14_3210 );
^
E:\opus\silk\x86\NSQ_sse4_1.c(455,37): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
pred_lag_ptr_0123 = _mm_mul_epi32( pred_lag_ptr_0123, b_Q14_0123 );
^
E:\opus\silk\x86\NSQ_sse4_1.c(623,31): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
xmm_xq_Q14_3210 = _mm_mul_epi32( xmm_xq_Q14_3210, xmm_Gain_Q10 );
^
E:\opus\silk\x86\NSQ_sse4_1.c(624,31): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
xmm_xq_Q14_x3x1 = _mm_mul_epi32( xmm_xq_Q14_x3x1, xmm_Gain_Q10 );
^
E:\opus\silk\x86\NSQ_sse4_1.c(625,31): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
xmm_xq_Q14_7654 = _mm_mul_epi32( xmm_xq_Q14_7654, xmm_Gain_Q10 );
^
E:\opus\silk\x86\NSQ_sse4_1.c(626,31): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
xmm_xq_Q14_x7x5 = _mm_mul_epi32( xmm_xq_Q14_x7x5, xmm_Gain_Q10 );
^
E:\opus\silk\x86\NSQ_sse4_1.c(633,31): error: '__builtin_ia32_pblendw128' needs target feature sse4.1
xmm_xq_Q14_3210 = _mm_blend_epi16( xmm_xq_Q14_3210, xmm_xq_Q14_x3x1, 0xCC );
^
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\lib\clang\12.0.0\include\smmintrin.h(516,13): note: expanded from macro '_mm_blend_epi16'
(__m128i) __builtin_ia32_pblendw128 ((__v8hi)(__m128i)(V1), \
^
E:\opus\silk\x86\NSQ_sse4_1.c(634,31): error: '__builtin_ia32_pblendw128' needs target feature sse4.1
xmm_xq_Q14_7654 = _mm_blend_epi16( xmm_xq_Q14_7654, xmm_xq_Q14_x7x5, 0xCC );
^
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\lib\clang\12.0.0\include\smmintrin.h(516,13): note: expanded from macro '_mm_blend_epi16'
(__m128i) __builtin_ia32_pblendw128 ((__v8hi)(__m128i)(V1), \
^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
[148/149] Building C object CMakeFiles\opus.dir\silk\x86\NSQ_del_dec_sse4_1.c.obj
FAILED: CMakeFiles/opus.dir/silk/x86/NSQ_del_dec_sse4_1.c.obj
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\bin\clang-cl.exe /nologo -DENABLE_HARDENING -DHAVE_CONFIG_H -DHAVE_LRINT -DHAVE_LRINTF -DOPUS_BUILD -DOPUS_HAVE_RTCD -DOPUS_X86_MAY_HAVE_SSE -DOPUS_X86_MAY_HAVE_SSE2 -DOPUS_X86_MAY_HAVE_SSE4_1 -DVAR_ARRAYS -D_CRT_SECURE_NO_WARNINGS -IE:\opus\include -IE:\opus\build -IE:\opus -IE:\opus\celt -IE:\opus\silk -IE:\opus\silk\float /DWIN32 /D_WINDOWS /W3 /MDd /Zi /Ob0 /Od /RTC1 /GS /showIncludes /FoCMakeFiles\opus.dir\silk\x86\NSQ_del_dec_sse4_1.c.obj /FdCMakeFiles\opus.dir\opus.pdb -c -- E:\opus\silk\x86\NSQ_del_dec_sse4_1.c
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(409,18): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
a_Q12_0123 = OP_CVTEPI16_EPI32_M64( a_Q12 );
^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
(_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(410,18): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
a_Q12_4567 = OP_CVTEPI16_EPI32_M64( a_Q12 + 4 );
^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
(_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(413,22): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
a_Q12_89AB = OP_CVTEPI16_EPI32_M64( a_Q12 + 8 );
^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
(_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(414,22): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
a_Q12_CDEF = OP_CVTEPI16_EPI32_M64( a_Q12 + 12 );
^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
(_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(418,22): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
b_Q12_0123 = OP_CVTEPI16_EPI32_M64( b_Q14 );
^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
(_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(433,39): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
tmpa = _mm_mul_epi32( pred_lag_ptr_tmp, b_Q12_0123 );
^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(437,39): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
pred_lag_ptr_tmp = _mm_mul_epi32( pred_lag_ptr_tmp, b_sr_Q12_0123 );
^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(488,35): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
tmpa = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_0123 ); /* 0, -1, -2, -3 * 0123 -> 0*0, 2*-2 */
^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(495,35): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
psLPC_Q14_tmp = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_tmp ); /* 1*-1, 3*-3 */
^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(502,35): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
tmpa = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_4567 );
^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(508,35): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
psLPC_Q14_tmp = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_tmp );
^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(517,39): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
tmpa = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_89AB );
^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(523,39): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
psLPC_Q14_tmp = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_tmp );
^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(530,39): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
tmpa = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_CDEF );
^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(536,39): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
psLPC_Q14_tmp = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_tmp );
^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(820,24): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_nsq_del_dec_scale_states_sse4_1' that is compiled without support for 'sse4.1'
xmm_x16_x2x0 = OP_CVTEPI16_EPI32_M64( &(x16[ i ] ) );
^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
(_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(825,24): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_nsq_del_dec_scale_states_sse4_1' that is compiled without support for 'sse4.1'
xmm_x16_x2x0 = _mm_mul_epi32( xmm_x16_x2x0, xmm_inv_gain_Q26 );
^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(826,24): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_nsq_del_dec_scale_states_sse4_1' that is compiled without support for 'sse4.1'
xmm_x16_x3x1 = _mm_mul_epi32( xmm_x16_x3x1, xmm_inv_gain_Q26 );
^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(831,24): error: '__builtin_ia32_pblendw128' needs target feature sse4.1
xmm_x16_x2x0 = _mm_blend_epi16( xmm_x16_x2x0, xmm_x16_x3x1, 0xCC );
^
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\lib\clang\12.0.0\include\smmintrin.h(516,13): note: expanded from macro '_mm_blend_epi16'
(__m128i) __builtin_ia32_pblendw128 ((__v8hi)(__m128i)(V1), \
^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
ninja: build stopped: subcommand failed.
After some investigation, I found that the problem is that clang-cl also sets the MSVC flag in CMake, so the SSE4.1 support check uses check_flag(SSE4_1 /arch:SSE2) # SSE2 and above (https://github.com/xiph/opus/blob/c9d5bea13e3cb7381bfa897a45d8bab4e7b767a7/cmake/OpusFunctions.cmake#L99), which is not sufficient.
MSVC has no flag to enable SSE4.1 on its own. It can only be enabled with at least /arch:AVX, and is disabled otherwise. See https://docs.microsoft.com/en-us/cpp/build/reference/arch-x86?view=msvc-170. Clang, however, supports the finer-grained option -msse4.1. So in this case I recommend using if (MSVC AND CMAKE_C_COMPILER_ID STREQUAL "MSVC") to decide whether to use /arch:xx or -mxx.
I have tested this in my setup and it works well. Any thoughts? If it looks OK, I can file a PR for it.
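The diagnostic in the log ("always_inline function ... requires target feature 'sse4.1'") can be reproduced and resolved in isolation. The following is a minimal sketch, not Opus code: mul_low_lanes_sse4_1 is a hypothetical helper, and the target attribute is a GCC/Clang extension. It is assumed, but not verified here, that clang-cl 12 accepts the attribute in the same way. The attribute enables SSE4.1 for a single function, which is a per-function alternative to passing -msse4.1 for the whole file as proposed above.

```c
#include <stdint.h>
#include <smmintrin.h> /* SSE4.1 intrinsics such as _mm_mul_epi32 */

/* Hypothetical helper for illustration only, not part of Opus.
   Without the attribute (and without -msse4.1 on the command line),
   clang rejects the _mm_mul_epi32 call with the error seen in the log. */
#if defined(__GNUC__) || defined(__clang__)
__attribute__((target("sse4.1")))
#endif
static int64_t mul_low_lanes_sse4_1(int32_t a, int32_t b)
{
    __m128i va = _mm_set1_epi32(a);
    __m128i vb = _mm_set1_epi32(b);
    /* Signed 32x32 -> 64 bit multiply of 32-bit lanes 0 and 2. */
    __m128i prod = _mm_mul_epi32(va, vb);
    int64_t out[2];
    _mm_storeu_si128((__m128i *)out, prod);
    return out[0];
}
```

A caller must still check at run time that the CPU supports SSE4.1 before calling such a function, since the attribute only affects code generation.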
One more question: from the compilation I saw that /arch:AVX or -mavx is added to all files, which makes runtime detection useless. Is there a reason for that? Is it possible to move the AVX-optimized code into separate files, so that only those files are compiled with AVX and the code is not run when runtime detection finds no AVX support?
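The separation asked about here can be sketched as follows. This is an illustrative example under stated assumptions, not how Opus implements its runtime CPU detection: sum_c, sum_avx and select_sum are hypothetical names, and __builtin_cpu_init / __builtin_cpu_supports are GCC/Clang builtins (a Windows build would likely need a __cpuid-based check instead). Only the AVX variant is compiled with AVX enabled, here via a target attribute; in a real project it could equally live in its own source file that alone receives -mavx.

```c
#include <stddef.h>
#include <immintrin.h> /* AVX intrinsics */

typedef float (*sum_fn)(const float *x, size_t n);

/* Portable fallback, compiled with the baseline instruction set. */
static float sum_c(const float *x, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += x[i];
    return acc;
}

/* AVX variant: the only function compiled with AVX enabled. */
__attribute__((target("avx")))
static float sum_avx(const float *x, size_t n)
{
    __m256 acc = _mm256_setzero_ps();
    size_t i = 0;
    for (; i + 8 <= n; i += 8)
        acc = _mm256_add_ps(acc, _mm256_loadu_ps(x + i));
    float lanes[8];
    _mm256_storeu_ps(lanes, acc);
    float total = 0.0f;
    for (int k = 0; k < 8; k++)
        total += lanes[k];
    for (; i < n; i++) /* remaining tail elements */
        total += x[i];
    return total;
}

/* Pick the implementation once, based on what the running CPU reports. */
static sum_fn select_sum(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx") ? sum_avx : sum_c;
}
```

Because the baseline code never contains AVX instructions, a CPU without AVX only ever executes sum_c. That guarantee is lost if /arch:AVX or -mavx is applied to every file.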
Thanks!
Hi,
Clang-cl was never tested, so bugs there are not unexpected.
I think you are correct that the source file properties for MSVC are not set correctly, but I need some more time to refresh my memory. For runtime detection, the flags should certainly be applied only to the corresponding files.
So in summary:
- Add support for clang-cl to the Opus CMake build.
- Revisit source file properties and compiler flags for MSVC.
Feel free to file a PR; filing it on GitLab will make it easier to merge.
Regarding SSE4.1, there is no auto-vectorization for MSVC, so no compiler flag is needed for the SSE4.1 sources.
See (https://stackoverflow.com/a/64057905)
Regarding AVX, there are currently no AVX optimizations in Opus, so it is either PRESUME_AVX, which enables it in all builds, or MAY_HAVE, which is essentially a no-op.
I am working on rewriting the intrinsics logic in CMake, so I will try to add more comments.
I sent an email to [email protected] several days ago to apply for GitLab forking permission, but I haven't received a response so far. Could you please take a look so that I can fork and create a PR on GitLab? Thanks!
Hiya, saw the mail, just behind.
No need to ask permission to fork! Once you have a PR, I'll be happy to look.
Sorry, but the fork button is greyed out, so I can't fork it.
> So regarding SSE4.1 there is no auto-vectorization for MSVC, so there is no compiler flag needed for SSE4.1 source.
> See (https://stackoverflow.com/a/64057905)
> Regarding AVX in Opus there is no AVX optimizations currently, so either it's PRESUME_AVX which enabled it in all build or MAY_HAVE which is essentially no-op.
> I am working on rewriting the intrinsic logic in cmake so will try to make more comments.
Yes, MSVC doesn't have a flag for SSE4.1, but the MSVC-compatible clang-cl does.
https://github.com/xiph/opus/pull/257
I've opened the PR here since I still can't fork or branch on GitLab. Please help review it. If it looks good to you, I can move it to GitLab once I have permission. Thanks!
I'm also getting compilation errors with opus + clang-cl: install-x64-windows-bitwig-rel-out.log