clifford
clifford copied to clipboard
canonical_reordering_sign_euclidean() could be made faster
It looks to me like canonical_reordering_sign_euclidean()
's running time is probably quadratic in the highest index of any bit set in bitmap_a
(or maybe linear, if count_bits_set()
is fast, i.e. if not _utils.DISABLE_JIT
).
I think this could be improved to be at most linear in the total number of set bits in the two input bitmasks (often a very small number like 1 or 2, I'm guessing), and in fact even smaller than that-- it can be made linear in the number of bits that end up participating in swaps.
(To be precise: it needs to iterate over only the bits that are set in the bitwise-OR (bitmask_a|bitmask_b) whose indices are strictly higher than the lowest bit set in bitmask_b, and strictly lower than the highest bit set in bitmask_a).
I haven't looked at the call graph very closely, nor have I done any benchmarks, but I suspect this function is called heavily during multiplications of multivectors, so it might be worth optimizing.
For reference, the current implementation is this:
def canonical_reordering_sign_euclidean(bitmap_a, bitmap_b):
"""
Computes the sign for the product of bitmap_a and bitmap_b
assuming a euclidean metric
"""
a = bitmap_a >> 1
sum_value = 0
while a != 0:
sum_value = sum_value + count_set_bits(a & bitmap_b)
a = a >> 1
if (sum_value & 1) == 0:
return 1
else:
return -1
I think the following rewrite would work (sorry, I haven't tested it; I haven't learned how to build and test yet):
def canonical_reordering_sign_euclidean(bitmap_a, bitmap_b):
"""
Computes the sign for the product of bitmap_a and bitmap_b
assuming a euclidean metric
"""
# Bits in bitmap_a at or lower than the lowest bit in bitmap_b don't matter,
# so we start by zeroing them out to avoid wasting loop iterations on them.
# (Note that if bitmap_b is 0, this just zeros out bitmap_a too, which is appropriate.)
bitmap_a &= ~(((bitmap_b & -bitmap_b)<<1) - 1)
sign = 1 # (-1)**(num needed swaps seen so far)
bsign = 1 # (-1)**(num set bits seen in bitmap_b so far)
remaining_union = bitmap_a | bitmap_b
while (remaining_union & bitmap_a) != 0:
lowest_remaining_bit = remaining_union & -remaining_union
if (bitmap_a & lowest_remaining_bit) != 0:
# Let i be the index of lowest_remaining_bit.
# The number of set indices strictly lower than i in bitmap_b
# has parity bsign, and the i in bitmap_a needs to move to the right
# past each of them.
sign *= bsign
if (bitmap_b & lowest_remaining_bit) != 0:
bsign *= -1
remaining_union &= ~lowest_remaining_bit
return sign
Also, I think the _utils.DISABLE_JIT
implementation of the helper function count_set_bits()
could be sped up pretty simply too (it's no longer needed in the above implementation of canonical_reordering_sign_euclidean()
, but it's called by other things).
That's currently this:
def count_set_bits(bitmap: int) -> int:
""" Counts the number of bits set to 1 in bitmap """
count = 0
for i in set_bit_indices(bitmap):
count += 1
return count
which could be rewritten as this:
def count_set_bits(bitmap: int) -> int:
""" Counts the number of bits set to 1 in bitmap """
# Kernighan's bit counting algorithm
count = 0
while bitmap != 0:
bitmap &= bitmap-1 # turn off lowest bit
count += 1
return count