glam-rs
glam-rs copied to clipboard
Boolean vector type aliases are a portability hazard (BVec3/BVec3A, and BVec4/BVec4A)
Sorry for the dramatic title - it's mostly an ergonomics issue. :-)
Currently, BVec4A
and BVec3A
are type aliases for BVec3
and BVec4
on platforms without explicit SIMD support (most prominently AArch64).
This means they can be used interchangeably on those platforms, but not on x86. Someone writing code on an AArch64 platform could assume that the code will also compile on x86.
Possible solutions:
- Define
BVec3/4
andBVec3A/4A
as separate types regardless of SIMD support. - Deprecate
BVec3
andBVec4
. Personally I'm struggling to see the use case for them (in contrast toVec3
), but I could be missing something.
Cheers!
Currently BVecN
contain N x bool
, so they are quite compact. On platforms with SIMD the SIMD masks are wrapped with BVec3A
and BVec4
. The idea here is the best of both worlds with and without SIMD. But yes, if you are on a platform where SIMD is not available this could be a portability issue if people reference the types directly.
In the past I had implemented these to contain N x u32
so they would be closer to the SIMD implementation, but I think it probably performs worse than just using bool
(although, I don't think I ever measured this). I could go back to that I guess, at least in the cases like AArch64 where there isn't a SIMD implementation yet, but it might perform worse.
Let me know if you need help benchmarking on AArch64. :-)
I just bumped into this problem, this is inconsistent to the other SIMD types and a consistent implementation should really be chosen.
In the past I had implemented these to contain N x u32 so they would be closer to the SIMD implementation, but I think it probably performs worse than just using bool (although, I don't think I ever measured this). I could go back to that I guess, at least in the cases like AArch64 where there isn't a SIMD implementation yet, but it might perform worse.
I compared the performance the "vec3 select" benchmark of BVec3
with bool
versus u32
and the latter was 13% slower. So removing the type aliases on platforms where there is no SIMD implementation will definitely have a performance penalty.
However, I do intend to provide implementations for stable SIMD architectures and architectures which are not yet stable can use the core simd implementation so hopefully in practice this perf penalty won't be a problem.