graphene
Explore the idea of peeking SIMD data
It might be possible to return a pointer to the start of a `graphene_simd4f_t` (or `graphene_simd4x4f_t`) to functions that expect an array of floating point values. This would allow removing a stack allocation, and the copy that fills it, when all we care about is passing a bunch of floats to, say, GL.
Experiments on x86_64 seem to yield positive results, but it could be a combination of recent compilers and specific SIMD types, so this would require further investigation:
- does passing the reference to an `__m128` or a `float32x4_t` type actually lead to a SIMD register read?
- if the read happens, is it dependent on the OS?
- if the read happens, is it dependent on the type or version of the compiler?
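To make the two options concrete, here is a minimal sketch comparing the current "copy into a stack array" approach with the proposed "peek". All names (`example_simd4f`, `example_simd4f_init`, `example_simd4f_copy`, `example_simd4f_peek`, `example_sum4`) are hypothetical and not part of the Graphene API; `example_sum4` merely stands in for a consumer such as a GL call that takes a `const float *`. The peek assumes the SIMD type is laid out in memory as four contiguous floats and that the compiler tolerates reading it through a `float` pointer, which is exactly what the questions above need to verify.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical 4-float SIMD wrapper: pick a backend at compile time. */
#if defined(__SSE__)
# include <xmmintrin.h>
typedef __m128 example_simd4f;

static example_simd4f
example_simd4f_init (float x, float y, float z, float w)
{
  /* _mm_setr_ps places x in the lowest lane, so memory order is x, y, z, w */
  return _mm_setr_ps (x, y, z, w);
}
#elif defined(__ARM_NEON)
# include <arm_neon.h>
typedef float32x4_t example_simd4f;

static example_simd4f
example_simd4f_init (float x, float y, float z, float w)
{
  const float v[4] = { x, y, z, w };
  return vld1q_f32 (v);
}
#else
/* Scalar fallback so the sketch builds everywhere */
typedef struct { float x, y, z, w; } example_simd4f;

static example_simd4f
example_simd4f_init (float x, float y, float z, float w)
{
  example_simd4f v = { x, y, z, w };
  return v;
}
#endif

/* Current approach: copy the four lanes into a caller-provided array. */
static void
example_simd4f_copy (const example_simd4f *v, float out[4])
{
  assert (v != NULL && out != NULL);
  memcpy (out, v, sizeof (float) * 4);
}

/* Proposed approach: hand out a pointer to the vector's own storage.
 * ASSUMPTION: the type is four contiguous floats and the cast is
 * tolerated by the compiler; this is the behaviour under investigation.
 */
static const float *
example_simd4f_peek (const example_simd4f *v)
{
  assert (v != NULL);
  return (const float *) v;
}

/* Stand-in for a consumer that expects an array of four floats. */
static float
example_sum4 (const float *f)
{
  return f[0] + f[1] + f[2] + f[3];
}
```

With the copy, a caller declares `float tmp[4]`, fills it via `example_simd4f_copy (&v, tmp)`, and passes `tmp` on. With the peek, the caller passes `example_simd4f_peek (&v)` directly. Note that taking the address of the vector may force the compiler to spill it from a register to memory, so inspecting the generated assembly on each target would still be needed to confirm that the stack allocation really goes away.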