
`<bit>`: Use `_CountTrailingZeros[64]` for ARM64

Open StephanTLavavej opened this issue 2 years ago • 2 comments

In VS 2022 17.7 Preview 3 (internal MSVC-PR-469248), our compiler back-end dev Jack Buchanan implemented new intrinsics in `<intrin0.inl.h>`:

__MACHINEARM_ARM64(unsigned int _CountTrailingZeros(unsigned long))
__MACHINEARM_ARM64(unsigned int _CountTrailingZeros64(unsigned __int64))

We should take advantage of them in `<bit>`'s `countr_zero()`, actually implemented in `<limits>` by `_Countr_zero()`:

https://github.com/microsoft/STL/blob/5404ba9c25f26f25a0ac50e6c4defce7833a8da6/stl/inc/limits#L1208-L1219

Similar to how we use `_CountLeadingZeros[64]` for `countl_zero()`:

https://github.com/microsoft/STL/blob/5404ba9c25f26f25a0ac50e6c4defce7833a8da6/stl/inc/bit#L285-L306
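For illustration only, here is a rough sketch of what such a dispatch could look like. It is not the STL's actual code: `countr_zero_sketch` and the opt-in macro `SKETCH_HAS_COUNT_TRAILING_ZEROS` are hypothetical names, and including `<intrin.h>` to reach the intrinsics is an assumption. The sketch handles zero explicitly so that it does not depend on what the intrinsics return for a zero input, and it falls back to a portable loop anywhere the intrinsics are not known to be available (including Clang).

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical opt-in: define SKETCH_HAS_COUNT_TRAILING_ZEROS only when the
// toolset is known to provide _CountTrailingZeros[64] (MSVC targeting ARM64).
#if defined(SKETCH_HAS_COUNT_TRAILING_ZEROS) && defined(_MSC_VER) && !defined(__clang__) && defined(_M_ARM64)
#include <intrin.h> // assumption: this header exposes the new intrinsics
#define SKETCH_USE_CTZ_INTRINSIC 1
#else
#define SKETCH_USE_CTZ_INTRINSIC 0
#endif

// Hypothetical helper; returns the number of trailing zero bits in value,
// or the bit width of T when value is zero.
template <class T>
int countr_zero_sketch(T value) noexcept {
    static_assert(std::is_unsigned<T>::value, "T must be an unsigned integer type");
    constexpr int digits = std::numeric_limits<T>::digits;

    // Handle zero up front so neither path relies on zero-input behavior.
    if (value == 0) {
        return digits;
    }

#if SKETCH_USE_CTZ_INTRINSIC
    // For a nonzero value, widening to 32 bits does not change the count.
    if (digits <= 32) {
        return static_cast<int>(_CountTrailingZeros(static_cast<unsigned long>(value)));
    }
    return static_cast<int>(_CountTrailingZeros64(static_cast<unsigned __int64>(value)));
#else
    // Portable fallback: shift right until the lowest bit is set.
    int count = 0;
    while ((value & T{1}) == 0) {
        value = static_cast<T>(value >> 1);
        ++count;
    }
    return count;
#endif
}
```

The real implementation would also need to stay usable in constant expressions and keep the existing code paths for other architectures; the sketch leaves both out.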

StephanTLavavej avatar May 10 '23 19:05 StephanTLavavej

Not sure if this will be in 17.7 Preview 3 or 17.8 Preview 1 - we'll need to check.

StephanTLavavej avatar Jun 16 '23 18:06 StephanTLavavej

Looks like we might need to ask Clang to implement these intrinsics, similar to #1586.

Updating this issue to no longer mention ARM32; at this time we still need to keep it compiling and working, but we no longer care about optimizing for it. Only ARM64 performance matters.

StephanTLavavej avatar Feb 21 '24 22:02 StephanTLavavej