stdarch icon indicating copy to clipboard operation
stdarch copied to clipboard

Define unaligned vector types in stdarch

Open folkertdev opened this issue 5 months ago • 5 comments

Some C compilers (clang and gcc at least) define unaligned vector types. Vectors are often read from unaligned locations, and then using and dereferencing a pointer to the normal vector types would be UB. Hence these types:

// emmintrin.h
typedef double __m128d_u __attribute__((__vector_size__(16), __aligned__(1)));
typedef long long __m128i_u __attribute__((__vector_size__(16), __aligned__(1)));

// xmmintrin.h
typedef float __m128_u __attribute__((__vector_size__(16), __aligned__(1)));

// avxintrin.h
typedef float __m256_u __attribute__ ((__vector_size__ (32), __aligned__(1)));
typedef double __m256d_u __attribute__((__vector_size__(32), __aligned__(1)));
typedef long long __m256i_u __attribute__((__vector_size__(32), __aligned__(1)));

// avx512intrin.h
typedef float __m512_u __attribute__((__vector_size__(64), __aligned__(1)));
typedef double __m512d_u __attribute__((__vector_size__(64), __aligned__(1)));
typedef long long __m512i_u __attribute__((__vector_size__(64), __aligned__(1)));

They have the size of the standard vector types, but an alignment of 1. These types are useful for translating C to rust. e.g. https://github.com/immunant/c2rust/pull/1132.

Proposal

Define these types as one of these

#[repr(C)]
struct([u8; 16]);

#[repr(C)]
struct __m128d_u(u8, u8, u8, ..., u8);

#[repr(C, packed(1))]
struct __m128d_u(f64, f64);

I think the first option runs into arrays technically not being FFI-safe. Option 2 is kind of cumbersome, so I'd like to explore option 3 using packed. The packed option can be a little weird, but I don't think that matters here. The whole purpose of this type is just to load the value, and then it'll likely just get transmuted to a proper vector type.


I can open this as an ACP if that is the proper route.

folkertdev avatar Jun 27 '25 20:06 folkertdev

Does #[repr(simd, packed(1))] struct __m128d_u([f64; 2]) work? And is that compatible with the calling convention that Clang and GCC use?

bjorn3 avatar Jun 27 '25 20:06 bjorn3

Not sure, how would I test that?

I know that the plan is to eventually remove repr(simd), that's why I left it out above.

folkertdev avatar Jun 27 '25 20:06 folkertdev

based on

https://godbolt.org/z/W4Y1rrfc4

#![feature(repr_simd)]

#[derive(Clone, Copy)]
#[repr(simd, packed(1))]
struct __m128d_u([f64; 2]);

#[unsafe(no_mangle)]
 extern "C" fn id(x: &__m128d_u) -> __m128d_u {
    *x
}

It does not, In fact, it looks like packed has no effect whatsoever and the load is still aligned to 16:

define <2 x double> @id(ptr noalias nocapture noundef readonly align 16 dereferenceable(16) %x) unnamed_addr {
start:
  %_0 = load <2 x double>, ptr %x, align 16
  ret <2 x double> %_0
}

The equivalent clang code does use an alignment of 1: https://godbolt.org/z/6MzTP3c6e

folkertdev avatar Jun 29 '25 18:06 folkertdev

afaik currently repr(simd) doesn't interact at all with other reprs, including packed and align. This might need some modifications in rustc

sayantn avatar Jun 29 '25 18:06 sayantn

I checked with Jubilee and packed getting ignored is deliberate. So in effect I guess it's not possible today to have a type that is passed like a vector, but loaded/stored with an alignment of 1.

folkertdev avatar Jun 29 '25 19:06 folkertdev