_mm_malloc and _mm_free (feature request)

Open jeffhammond opened this issue 2 years ago • 6 comments

Many codes that use SSE and friends also use _mm_{malloc,realloc,free}.

SSE2Neon supports these (https://github.com/DLTcollab/sse2neon/pull/25/files).

The implementation is simple and I will try to contribute if no one else does it first.

Thanks.

jeffhammond avatar Feb 18 '22 09:02 jeffhammond

I was wondering whether there has been any progress on this issue. I am trying to port a package from Intel to Arm; it can compile with SSE, AVX2, or AVX512 instructions, depending on the machine. Can I simply replace _mm_malloc() with posix_memalign()?

thomasdwu avatar Feb 12 '23 17:02 thomasdwu

Yes. https://stackoverflow.com/q/32612881/2189128 has some context.

jeffhammond avatar Feb 12 '23 18:02 jeffhammond

You'll have to replace _mm_free as well, but just with free().
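
A minimal sketch of what that replacement could look like, assuming a POSIX system where <stdlib.h> declares posix_memalign(). The names portable_mm_malloc and portable_mm_free are hypothetical, chosen only for illustration; they are not part of SIMDe or SSE2Neon.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for _mm_malloc(size, align); align is in bytes.
 * On strict -std=c99 builds, _POSIX_C_SOURCE may need to be defined as
 * 200112L or later before including <stdlib.h>. */
static void *portable_mm_malloc(size_t size, size_t align)
{
    void *ptr = NULL;

    /* posix_memalign() rejects alignments smaller than sizeof(void *),
     * so round small requests up. This is an assumption of this sketch. */
    if (align < sizeof(void *))
        align = sizeof(void *);

    /* posix_memalign() returns 0 on success and an error code otherwise. */
    if (posix_memalign(&ptr, align, size) != 0)
        return NULL;

    return ptr;
}

/* Hypothetical stand-in for _mm_free(): memory obtained from
 * posix_memalign() is released with plain free(). */
static void portable_mm_free(void *ptr)
{
    free(ptr);
}
```

Unlike _mm_malloc(), posix_memalign() reports failure through its return value and hands back the pointer through its first argument, which is why the sketch wraps it rather than substituting it one for one.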

jeffhammond avatar Feb 12 '23 18:02 jeffhammond

Got it, thanks. I also see that posix_memalign() takes its argument in terms of bits, whereas _mm_malloc() takes it in terms of bytes.

thomasdwu avatar Feb 12 '23 18:02 thomasdwu

The alignment argument is in bytes from what I can tell.

"The value of alignment shall be a power of two multiple of sizeof(void *)."

https://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_memalign.html
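
To make that rule concrete, here is a small sketch that checks whether a byte count satisfies it. The helper name is_valid_posix_alignment is hypothetical and exists only for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if `alignment` (in bytes) is a power-of-two multiple of
 * sizeof(void *), which is what posix_memalign() requires. */
static bool is_valid_posix_alignment(size_t alignment)
{
    /* Must be a nonzero multiple of the pointer size. */
    if (alignment == 0 || alignment % sizeof(void *) != 0)
        return false;

    /* The multiple itself must be a power of two (1, 2, 4, ...). */
    size_t multiple = alignment / sizeof(void *);
    return (multiple & (multiple - 1)) == 0;
}
```

So 16, 32, and 64 bytes (typical for SSE, AVX2, and AVX512 data) are all accepted, while 2 or 24 are not.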

jeffhammond avatar Feb 12 '23 20:02 jeffhammond

Yes, you're right. Somehow, I was mistakenly thinking that addresses were in terms of bits.

thomasdwu avatar Feb 12 '23 21:02 thomasdwu