arrow
arrow copied to clipboard
[C++] bit_util: Remove the pre-computed `kBitmask` table ?
Describe the enhancement requested
I see the pr removed the kBitMask in arrow-rs[1]. Besides, abseil mentions that this optimization is useless in Haswell machine [2][3], and this might also suffer from cache eviction. Should we trying to remove them and use something like above:
// Bitmask selecting the k-th bit in a byte
// static constexpr uint8_t kBitmask[] = {1, 2, 4, 8, 16, 32, 64, 128};
static constexpr uint8_t GetBitMask(uint8_t index) {
// DCHECK(index >= 0 && index <= 7);
return static_cast<uint8_t>(1) << index;
}
// the bitwise complement version of kBitmask
// static constexpr uint8_t kFlippedBitmask[] = {254, 253, 251, 247, 239, 223, 191, 127};
static constexpr uint8_t GetFlippedBitMask(uint8_t index) {
// DCHECK(index >= 0 && index <= 7);
return ~(static_cast<uint8_t>(1) << index);
}
[1] https://github.com/apache/arrow-rs/issues/5771 [2] https://abseil.io/fast/9 [3] https://abseil.io/fast/39
Component(s)
C++
cc @felipecrv @pitrou
Well, feel free to do the change and run any benchmarks.
It's a bit faster on My M1 MacOS with existing benchmark ( arrow-bit-util-benchmark ), I'll push first and runing ursa benchmark