stdarch

Missing x86 vendor intrinsics (SSE2, SSE 4.1, AVX2)

Open newpavlov opened this issue 4 years ago • 8 comments

Previous issue: #40

AVX2

MMX

EDIT(@workingjubilee): Direct MMX support is no longer in scope for std::arch, see:

  • https://github.com/rust-lang/stdarch/pull/890

SSE

SSE2

SSE4.1

Personally, I am interested only in _mm_stream_load_si128 and _mm256_stream_load_si256, but I think it's worth properly tracking all unimplemented intrinsics. Some of those intrinsics (e.g. _mm_malloc and _mm_free) probably should not be exposed, but in my opinion the motivation behind such a decision should be explicitly recorded somewhere (ideally in comments in the relevant source files).

newpavlov, Jun 07 '21 05:06

I was under the impression that we'd deliberately removed the MMX stuff.

Lokathor, Jun 07 '21 05:06

Indeed. It requires special handling in the compiler to emit the right type for MMX vectors, as they are a different type from regular vectors. In addition, it is pretty much impossible to use correctly, as LLVM can reorder MMX usage before the intrinsic that enables MMX.

bjorn3, Jun 07 '21 05:06

What about the streaming load intrinsics? Is there a reason why they have been omitted?

newpavlov, Jun 07 '21 07:06

Some of the streaming ops are already in and stabilized (e.g. _mm_stream_pd).

Given this, I'd guess that any missing streaming ops are likely an oversight (at least for the 128-bit and 256-bit ones).

Lokathor, Jun 07 '21 07:06
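
For readers unfamiliar with the streaming ops mentioned above, here is a minimal sketch of calling the already-stabilized _mm_stream_pd from Rust. The wrapper type `Aligned` and the function `stream_pair` are hypothetical names made up for this illustration; the sketch assumes x86_64 (where SSE2 is part of the baseline) and falls back to a plain store on other targets.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Hypothetical helper type: _mm_stream_pd needs a 16-byte-aligned destination.
#[repr(align(16))]
struct Aligned([f64; 2]);

/// Hypothetical helper: writes `lo` and `hi` through a non-temporal store
/// and returns what ended up in memory.
fn stream_pair(lo: f64, hi: f64) -> [f64; 2] {
    let mut out = Aligned([0.0; 2]);

    #[cfg(target_arch = "x86_64")]
    unsafe {
        // _mm_set_pd takes the high element first, then the low element.
        let v = _mm_set_pd(hi, lo);
        // Non-temporal store that hints the CPU to bypass the cache.
        _mm_stream_pd(out.0.as_mut_ptr(), v);
        // Fence before the streamed memory is accessed again.
        _mm_sfence();
    }

    #[cfg(not(target_arch = "x86_64"))]
    {
        // Plain store on targets without the intrinsic.
        out.0 = [lo, hi];
    }

    out.0
}
```

Calling `stream_pair(1.5, 2.5)` should give back `[1.5, 2.5]`. The `_mm_sfence` call is there because the non-temporal store is weakly ordered, so the sketch fences before reading the destination back.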

_mm_broadcastsi128_si256 seems to be an alias for _mm256_broadcastsi128_si256, which is implemented. The intrinsics guide lists both as translating to the same instruction and with the same description.

jhorstmann, Nov 05 '21 14:11
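
As an illustration of the implemented variant, the following sketch uses _mm256_broadcastsi128_si256 to copy one 128-bit lane into both halves of a 256-bit vector. The function names `broadcast_lane` and `broadcast` are hypothetical; the sketch assumes AVX2 is detected at runtime and otherwise uses a scalar fallback, so it is not tied to any particular CPU.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Hypothetical helper: broadcasts a 128-bit lane with the AVX2 intrinsic.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn broadcast_lane(lane: [i32; 4]) -> [i32; 8] {
    // Unaligned load of the four 32-bit values into a 128-bit register.
    let a = _mm_loadu_si128(lane.as_ptr() as *const __m128i);
    // Duplicate the 128-bit value into both halves of a 256-bit register.
    let wide = _mm256_broadcastsi128_si256(a);
    let mut out = [0i32; 8];
    _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, wide);
    out
}

/// Hypothetical safe wrapper with a portable fallback.
fn broadcast(lane: [i32; 4]) -> [i32; 8] {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // Safe to call: AVX2 support was just checked at runtime.
            return unsafe { broadcast_lane(lane) };
        }
    }
    // Scalar fallback: copy the lane into both halves.
    let mut out = [0i32; 8];
    out[..4].copy_from_slice(&lane);
    out[4..].copy_from_slice(&lane);
    out
}
```

With either path, `broadcast([1, 2, 3, 4])` yields `[1, 2, 3, 4, 1, 2, 3, 4]`.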

_mm_malloc and _mm_free seem like they require implementing in libstd.

workingjubilee, Jun 25 '24 06:06
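
For context, aligned allocation is already expressible in Rust through std::alloc, which may be part of why _mm_malloc and _mm_free are not needed as intrinsics. The sketch below is not an implementation of those intrinsics; `aligned_roundtrip` is a hypothetical function that allocates a buffer with a requested alignment, checks the returned address, and frees it.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// Hypothetical helper: allocates `size` bytes aligned to `align`,
/// verifies the alignment of the returned pointer, then frees the buffer.
/// Returns false for invalid layouts, zero sizes, or allocation failure.
fn aligned_roundtrip(size: usize, align: usize) -> bool {
    // Layout rejects alignments that are zero or not a power of two.
    let layout = match Layout::from_size_align(size, align) {
        Ok(layout) if layout.size() > 0 => layout,
        _ => return false,
    };

    unsafe {
        // `alloc` requires a non-zero size, which was checked above.
        let ptr = alloc(layout);
        if ptr.is_null() {
            return false;
        }
        let is_aligned = (ptr as usize) % align == 0;
        // The same layout must be passed back when freeing.
        dealloc(ptr, layout);
        is_aligned
    }
}
```

Unlike _mm_free, `dealloc` needs the original `Layout`, so the caller has to keep the size and alignment around for the lifetime of the allocation.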

Note that there are issues with these streaming intrinsics, as they have nontemporal hints that are not properly modelled: https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/Non-temporal.20stores

Noratrieb, Jun 26 '24 07:06

They've been converted into assembly.

workingjubilee, Jun 26 '24 07:06