Missing x86 vendor intrinsics (SSE2, SSE 4.1, AVX2)
Previous issue: #40
AVX2
MMX
EDIT(@workingjubilee): Direct MMX support is no longer in scope for std::arch, see:
- https://github.com/rust-lang/stdarch/pull/890
SSE
- [ ] _mm_free
- [ ] _mm_storeu_si16
- [ ] _mm_loadu_si16
- [ ] _mm_malloc
- [ ] _mm_storeu_si64
SSE2
- [ ] _mm_loadu_si32
- [ ] _mm_storeu_si32
SSE4.1
Personally, I am interested only in _mm_stream_load_si128 and _mm256_stream_load_si256, but I think it's worth properly tracking all unimplemented intrinsics. Some of those intrinsics (e.g. _mm_malloc and _mm_free) probably should not be exposed, but, in my opinion, the motivation behind such a decision should be explicitly recorded somewhere (ideally in comments in the relevant source files).
I was under the impression that we'd deliberately removed the MMX stuff.
Indeed. It requires special handling in the compiler to emit the right type for MMX vectors as they are a different type from regular vectors. In addition it is pretty much impossible to use correctly as LLVM can reorder MMX usage before the intrinsic that enables MMX.
What about the streaming load intrinsics? Is there a reason why they have been omitted?
Some of the streaming ops are already in, and stabilized (e.g. _mm_stream_pd).
Given this, I'd guess that any missing streaming ops are likely an oversight (at least for 128 or 256 bit).
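For reference, here is a minimal sketch of how the already-stabilized `_mm_stream_pd` can be called. The `Aligned16` wrapper and `stream_pair` function are hypothetical names made up for this example; the sketch assumes an x86_64 target, where SSE2 is part of the baseline.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{_mm_set_pd, _mm_sfence, _mm_stream_pd};

/// Hypothetical wrapper: `_mm_stream_pd` needs a 16-byte-aligned destination.
#[repr(align(16))]
pub struct Aligned16(pub [f64; 2]);

/// Hypothetical helper: writes `[lo, hi]` into `dst` with a non-temporal store.
#[cfg(target_arch = "x86_64")]
pub fn stream_pair(dst: &mut Aligned16, lo: f64, hi: f64) {
    unsafe {
        // `_mm_set_pd` takes the high element first, then the low element.
        let v = _mm_set_pd(hi, lo);
        // Non-temporal store of both lanes to the aligned destination.
        _mm_stream_pd(dst.0.as_mut_ptr(), v);
        // Fence so the streamed data is ordered before any later access.
        _mm_sfence();
    }
}
```

The `_mm_sfence` call is there because a non-temporal store is not ordered like a regular store; whether that is sufficient in every situation is an assumption of this sketch.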
_mm_broadcastsi128_si256 seems to be an alias for _mm256_broadcastsi128_si256 which is implemented. The intrinsics guide lists both as translating to the same instruction and with the same description.
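As a rough illustration, code that wants the `_mm_broadcastsi128_si256` behaviour can call the implemented `_mm256_broadcastsi128_si256` directly. The helper names below (`broadcast_impl`, `broadcast_128_to_256`) are hypothetical; the sketch assumes an x86_64 target and checks for AVX2 at runtime.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{
    __m128i, __m256i, _mm256_broadcastsi128_si256, _mm256_storeu_si256, _mm_loadu_si128,
};

/// Hypothetical helper; must only be called when AVX2 is available.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn broadcast_impl(lanes: [i32; 4]) -> [i32; 8] {
    // Load the four 32-bit lanes into a 128-bit vector (unaligned load).
    let x = unsafe { _mm_loadu_si128(lanes.as_ptr() as *const __m128i) };
    // Copy the 128-bit value into both halves of a 256-bit vector.
    let y = _mm256_broadcastsi128_si256(x);
    let mut out = [0i32; 8];
    unsafe { _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, y) };
    out
}

/// Hypothetical safe wrapper: returns `None` if the CPU lacks AVX2.
#[cfg(target_arch = "x86_64")]
pub fn broadcast_128_to_256(lanes: [i32; 4]) -> Option<[i32; 8]> {
    if is_x86_feature_detected!("avx2") {
        Some(unsafe { broadcast_impl(lanes) })
    } else {
        None
    }
}
```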
_mm_malloc and _mm_free seem like they would need to be implemented in libstd.
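To make the libstd dependency concrete, here is a minimal sketch of aligned allocation on top of `std::alloc`. The names `aligned_alloc` and `aligned_free` are hypothetical stand-ins, not proposed signatures. Note one mismatch: the C `_mm_free` takes only a pointer, while `std::alloc::dealloc` must be given the same layout that was used to allocate, so this sketch asks the caller to pass the size and alignment again.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// Hypothetical stand-in for `_mm_malloc`: returns null if the size is zero,
/// the alignment is not a power of two, or the allocation fails.
pub unsafe fn aligned_alloc(size: usize, align: usize) -> *mut u8 {
    match Layout::from_size_align(size, align) {
        // `alloc` must not be called with a zero-sized layout.
        Ok(layout) if layout.size() != 0 => unsafe { alloc(layout) },
        _ => std::ptr::null_mut(),
    }
}

/// Hypothetical stand-in for `_mm_free`. Unlike the C intrinsic, the caller
/// must pass the same `size` and `align` that were used for the allocation.
pub unsafe fn aligned_free(ptr: *mut u8, size: usize, align: usize) {
    if ptr.is_null() {
        return;
    }
    if let Ok(layout) = Layout::from_size_align(size, align) {
        unsafe { dealloc(ptr, layout) };
    }
}
```

Because both functions depend on the global allocator, they cannot live in `core`, which is one plausible reason they would have to be provided from libstd rather than alongside the other intrinsics.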
Note that there are issues with these streaming intrinsics, as they carry non-temporal hints that are not properly modelled: https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/Non-temporal.20stores
They've been converted into assembly.