Henry de Valence

229 comments by Henry de Valence

Cool! I'll give it a try some time this week, RustConf permitting.

Should AVX-512 intrinsics be split into modules corresponding to their feature flag? This seems sensible except that I'm not sure how it should interact with the AVX512VL extension, since it...

Hmm, but the VL flag is orthogonal to the other flags, so for instance the `_mm256_madd52hi_epu64` intrinsic requires IFMA *and* VL. Where should it live?
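
To make the "IFMA *and* VL" point concrete, here is a minimal sketch of checking the two flags independently at runtime. The helper names `detected_ifma_vl` and `madd52_supported` are hypothetical and used only for illustration. The width rule (512-bit needs IFMA only, 128/256-bit also need VL) follows the requirement described above; it is not a statement about how stdsimd organizes its modules.

```rust
/// Hypothetical helper: report the AVX512IFMA and AVX512VL flags separately,
/// since the VL flag is orthogonal to the other AVX-512 feature flags.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn detected_ifma_vl() -> (bool, bool) {
    (
        is_x86_feature_detected!("avx512ifma"),
        is_x86_feature_detected!("avx512vl"),
    )
}

/// Fallback for non-x86 targets: neither flag can be present.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn detected_ifma_vl() -> (bool, bool) {
    (false, false)
}

/// Hypothetical helper: can a `madd52hi`-style intrinsic of the given vector
/// width be used with these flags? The 512-bit form needs only IFMA, while the
/// 128- and 256-bit forms (e.g. `_mm256_madd52hi_epu64`) need IFMA and VL.
pub fn madd52_supported(width_bits: u32, ifma: bool, vl: bool) -> bool {
    match width_bits {
        512 => ifma,
        128 | 256 => ifma && vl,
        _ => false,
    }
}
```

Because the requirement is a conjunction of two independent flags, a module-per-flag layout has no single obvious home for the 256-bit intrinsic, which is the question raised here.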

> avx-512 is complicated :/

no kidding... looking at the AVX-512 Venn diagram:

![image](https://user-images.githubusercontent.com/44879/45388477-e8980480-b5cd-11e8-90a5-a5afbf724b72.png)

it seems like the only CPUs that don't have VL extensions for all of their supported...

I started working on this in this branch: https://github.com/hdevalence/stdsimd/tree/avx512 (very rough work). I'm not sure how to encode the masks. Considering `_mm512_abs_epi32` as an example, there are masked versions `_mm512_mask_abs_epi32` and...

Update, I found the `simd_select` intrinsic, which seems like the answer to my last question. After some more digging, I realized that the allintrinsics gist seems out of date, and...
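
As a rough illustration of why a select operation answers the masking question, here is a scalar model of the `mask` and `maskz` variants of an absolute-value operation. The functions work on plain `[i32; 16]` arrays and use hypothetical names. They model only the lane semantics and are not the signature of `simd_select` or the code in the branch.

```rust
/// Scalar stand-in for a lane-wise select: lane `i` comes from `if_true`
/// when bit `i` of `k` is set, and from `if_false` otherwise.
pub fn select_lanes(k: u16, if_true: [i32; 16], if_false: [i32; 16]) -> [i32; 16] {
    let mut out = if_false;
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            out[i] = if_true[i];
        }
    }
    out
}

/// Unmasked model: absolute value of every lane (wrapping, so `i32::MIN`
/// stays `i32::MIN`).
pub fn abs_epi32(a: [i32; 16]) -> [i32; 16] {
    a.map(i32::wrapping_abs)
}

/// `mask` model: unselected lanes are copied from `src`.
pub fn mask_abs_epi32(src: [i32; 16], k: u16, a: [i32; 16]) -> [i32; 16] {
    select_lanes(k, abs_epi32(a), src)
}

/// `maskz` model: unselected lanes are zeroed.
pub fn maskz_abs_epi32(k: u16, a: [i32; 16]) -> [i32; 16] {
    select_lanes(k, abs_epi32(a), [0; 16])
}
```

In this model both masked variants are the unmasked operation followed by one select, differing only in the fallback operand.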

Oops, I didn't refresh the page before I posted that, sorry for the confusion.

I updated that branch with definitions of `set[r]_epi[8,16,32,64]` and `_mm512_add_epi[8,16,32,64]`; next I'll try adding `mask`, `maskz`, and VL variants.

>> For the mask type, Clang effectively passes it around as a u16 (or however many mask bits are needed) and bitcasts that to the matching vector when doing passing...
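
The quoted description, an integer mask reinterpreted as one bit per lane, can be sketched in plain Rust as follows. The function names are hypothetical, and `[bool; 16]` merely stands in for a 16-lane vector of 1-bit values. This models the idea only and is not Clang's or stdsimd's actual representation.

```rust
/// Expand a 16-bit mask into one boolean per lane; bit `i` controls lane `i`.
pub fn mask_to_lanes(k: u16) -> [bool; 16] {
    let mut lanes = [false; 16];
    for (i, lane) in lanes.iter_mut().enumerate() {
        *lane = (k >> i) & 1 == 1;
    }
    lanes
}

/// Pack per-lane booleans back into a 16-bit mask (inverse of `mask_to_lanes`).
pub fn lanes_to_mask(lanes: [bool; 16]) -> u16 {
    let mut k = 0u16;
    for (i, &lane) in lanes.iter().enumerate() {
        if lane {
            k |= 1 << i;
        }
    }
    k
}
```

The two functions are inverses, so passing the mask around as a `u16` loses no information relative to the per-lane form.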

If anyone has time to implement such an intrinsic (I don't know how to do it myself), I'd like to start adding some AVX512 intrinsics.