rustc_codegen_cranelift

missing avx2 intrinsics

Open lu-zero opened this issue 1 year ago • 9 comments

Current rav1e master shows:

warning: unsupported x86 llvm intrinsic llvm.x86.avx2.pabs.w; replacing with trap

warning: unsupported x86 llvm intrinsic llvm.x86.avx2.pmadd.wd; replacing with trap

warning: unsupported x86 llvm intrinsic llvm.x86.avx2.psign.w; replacing with trap

warning: unsupported x86 llvm intrinsic llvm.x86.avx2.psrlv.d.256; replacing with trap

warning: unsupported x86 llvm intrinsic llvm.x86.avx2.packssdw; replacing with trap

warning: unsupported x86 llvm intrinsic llvm.x86.avx2.gather.d.d.256; replacing with trap

warning: unsupported x86 llvm intrinsic llvm.x86.avx.cvtdq2.ps.256; replacing with trap

warning: unsupported x86 llvm intrinsic llvm.x86.avx2.psllv.d.256; replacing with trap

warning: unsupported x86 llvm intrinsic llvm.x86.avx2.psrav.d.256; replacing with trap

warning: unsupported x86 llvm intrinsic llvm.x86.sse2.packssdw.128; replacing with trap

lu-zero avatar Nov 08 '23 10:11 lu-zero

Some of these (but not all) should have been implemented by https://github.com/rust-lang/rustc_codegen_cranelift/pull/1417 which hasn't landed on rustc nightly yet.

bjorn3 avatar Nov 08 '23 11:11 bjorn3

Yes :) I tested with the current master and the list is shorter:

warning: unsupported x86 llvm intrinsic llvm.x86.avx2.pabs.w; replacing with trap

warning: unsupported x86 llvm intrinsic llvm.x86.avx2.psign.w; replacing with trap

warning: unsupported x86 llvm intrinsic llvm.x86.avx2.psrlv.d.256; replacing with trap

warning: unsupported x86 llvm intrinsic llvm.x86.avx2.psllv.d.256; replacing with trap

warning: unsupported x86 llvm intrinsic llvm.x86.avx2.psrav.d.256; replacing with trap

warning: unsupported x86 llvm intrinsic llvm.x86.avx.cvtdq2.ps.256; replacing with trap

warning: unsupported x86 llvm intrinsic llvm.x86.avx2.gather.d.d.256; replacing with trap

lu-zero avatar Nov 09 '23 07:11 lu-zero

The gather intrinsics are now implemented on the even_more_simd_intrinsics branch.

bjorn3 avatar Nov 24 '23 19:11 bjorn3

I'm getting the warning "unsupported x86 llvm intrinsic llvm.x86.avx2.psllv.d.256; replacing with trap", and invoking the image-rs crate (AVIF encoder) results in a crash:

trap at Instance { def: Item(DefId(2:15232 ~ core[a9f5]::core_arch::x86::avx2::_mm256_sllv_epi32)), args: [] } (_ZN4core9core_arch3x864avx217_mm256_sllv_epi3217h96c8b42e4aa5b850E): llvm.x86.avx2.psllv.d.256

MrFoxPro avatar Mar 14 '24 12:03 MrFoxPro

I don't see permd listed above, but this is the same issue, right?

trap at Instance { def: Item(DefId(1:15384 ~ core[212b]::core_arch::x86::avx2::_mm256_permutevar8x32_epi32)), args: [] } (_ZN4core9core_arch3x864avx227_mm256_permutevar8x32_epi3217h294fc7d72ae002f8E): llvm.x86.avx2.permd

Would be nice if there was a codegen_backend cfg so I could conditionally ignore the test that triggers this.
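Until such a cfg exists, one possible workaround is a custom cfg passed by hand when building with the Cranelift backend. The following is only a sketch: `cranelift_backend` is a made-up name that rustc does not set on its own, so it would have to be supplied manually (for example through `RUSTFLAGS="--cfg cranelift_backend"`), and recent toolchains may emit an `unexpected_cfgs` warning for a cfg they have not been told about.

```rust
// `cranelift_backend` is a hypothetical, user-defined cfg. rustc does not set it;
// it has to be passed manually, e.g. RUSTFLAGS="--cfg cranelift_backend".

/// Reports whether the hypothetical cfg was passed at build time.
pub fn built_with_cranelift_cfg() -> bool {
    cfg!(cranelift_backend)
}

#[cfg(test)]
mod tests {
    // The test is marked as ignored only when the hypothetical cfg is present;
    // without it, the test runs as usual.
    #[test]
    #[cfg_attr(
        cranelift_backend,
        ignore = "hits an unsupported AVX2 intrinsic under Cranelift"
    )]
    fn avx2_dependent_test() {
        // Body of the test that would otherwise reach the trap goes here.
    }
}
```

The drawback of this approach is that the flag and the backend choice are set independently, so they can drift apart; a cfg provided by the compiler itself would not have that problem.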

svix-jplatte avatar Apr 25 '24 10:04 svix-jplatte

I just added permd in https://github.com/rust-lang/rustc_codegen_cranelift/pull/1491 (cc @svix-jplatte)

That PR is also a good template for anyone who wants to add other AVX2 intrinsics. Cranelift does not currently support values that wide (256-bit; not sure if it ever will), so the implementation has to simulate what the instruction does.
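To illustrate what "simulating the instruction" means, here is a plain scalar model of the still-missing `llvm.x86.avx2.psllv.d.256` (`_mm256_sllv_epi32`). This is not the code from the PR: the real backend would emit Cranelift IR rather than run a Rust loop. The function name `sllv_epi32_model` and the array-based signature are assumptions made for this sketch; only the lane semantics (eight 32-bit lanes, each shifted left by its own count, with counts above 31 giving 0) come from the instruction's documented behaviour.

```rust
/// Scalar model of `_mm256_sllv_epi32` (llvm.x86.avx2.psllv.d.256).
/// The 256-bit vector is treated as eight independent 32-bit lanes.
/// Hypothetical helper for illustration; not part of rustc_codegen_cranelift.
pub fn sllv_epi32_model(a: [u32; 8], count: [u32; 8]) -> [u32; 8] {
    let mut out = [0u32; 8];
    for i in 0..8 {
        // `checked_shl` returns None when the shift amount is 32 or more,
        // which matches the instruction's rule that such lanes become 0.
        out[i] = a[i].checked_shl(count[i]).unwrap_or(0);
    }
    out
}
```

For example, calling it with `a = [1, 1, 1, 0xFFFF_FFFF, 2, 3, 4, 5]` and `count = [0, 4, 31, 1, 32, 33, u32::MAX, 2]` gives `[1, 16, 0x8000_0000, 0xFFFF_FFFE, 0, 0, 0, 20]`: lanes with a count of 32 or more are zeroed, and bits shifted past bit 31 are discarded.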

folkertdev avatar May 11 '24 21:05 folkertdev

For that to become available via rustup, this repo needs to be synced into https://github.com/rust-lang/rust again, right?

svix-jplatte avatar May 13 '24 08:05 svix-jplatte

Correct, I can do that later today.

bjorn3 avatar May 13 '24 08:05 bjorn3

Thanks, yesterday's nightly has support! With that one fixed, I'm now also running into an llvm.x86.avx2.psllv.d.256 trap, like MrFoxPro :smile:

svix-jplatte avatar May 15 '24 09:05 svix-jplatte