glam-rs icon indicating copy to clipboard operation
glam-rs copied to clipboard

Boolean vector type aliases are a portability hazard (BVec3/BVec3A, and BVec4/BVec4A)

Open simonask opened this issue 2 years ago • 3 comments

Sorry for the dramatic title - it's mostly an ergonomics issue. :-)

Currently, BVec4A and BVec3A are type aliases for BVec3 and BVec4 on platforms without explicit SIMD support (most prominently AArch64).

This means they can be used interchangeably on those platforms, but not on x86. Someone writing code on an AArch64 platform could assume that the code will also compile on x86.

Possible solutions:

  • Define BVec3/4 and BVec3A/4A as separate types regardless of SIMD support.
  • Deprecate BVec3 and BVec4. Personally I'm struggling to see the use case for them (in contrast to Vec3), but I could be missing something.

Cheers!

simonask avatar Jun 24 '22 15:06 simonask

Currently BVecN contain N x bool, so they are quite compact. On platforms with SIMD the SIMD masks are wrapped with BVec3A and BVec4. The idea here is the best of both worlds with and without SIMD. But yes, if you are on a platform where SIMD is not available this could be a portability issue if people reference the types directly.

In the past I had implemented these to contain N x u32 so they would be closer to the SIMD implementation, but I think it probably performs worse than just using bool (although, I don't think I ever measured this). I could go back to that I guess, at least in the cases like AArch64 where there isn't a SIMD implementation yet, but it might perform worse.

bitshifter avatar Jun 25 '22 01:06 bitshifter

Let me know if you need help benchmarking on AArch64. :-)

simonask avatar Jun 27 '22 14:06 simonask

I just bumped into this problem, this is inconsistent to the other SIMD types and a consistent implementation should really be chosen.

makspll avatar Jul 05 '22 11:07 makspll

In the past I had implemented these to contain N x u32 so they would be closer to the SIMD implementation, but I think it probably performs worse than just using bool (although, I don't think I ever measured this). I could go back to that I guess, at least in the cases like AArch64 where there isn't a SIMD implementation yet, but it might perform worse.

I compared the performance the "vec3 select" benchmark of BVec3 with bool versus u32 and the latter was 13% slower. So removing the type aliases on platforms where there is no SIMD implementation will definitely have a performance penalty.

However, I do intend to provide implementations for stable SIMD architectures and architectures which are not yet stable can use the core simd implementation so hopefully in practice this perf penalty won't be a problem.

bitshifter avatar Aug 27 '22 05:08 bitshifter