clkhash
Remove dependency on Bitarray
I propose removing the dependency on Bitarray and using bitwise operations on ints instead.
I see no good reason to use bitarray. The only two operations we use on it are:
- setting a particular bit to True, which can be done with bitwise OR; and
- converting to bytes, which is also provided by int.
Removing Bitarray would mean one fewer dependency!
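As a minimal sketch of those two operations on a plain int (Python 3 only; the helper names `set_bit` and `to_padded_bytes` are hypothetical and not part of clkhash): this assumes we want to match bitarray's default big-endian layout, where index 0 is the most significant bit of the first byte and any unused bits in the last byte are zero. The left shift by the pad width only matters when `n` is not a multiple of 8.

```python
def set_bit(acc, index, n):
    """Return acc with bit `index` set, counting from the most
    significant of n bits (hypothetical helper)."""
    if not 0 <= index < n:
        raise IndexError("bit index out of range")
    return acc | (1 << (n - 1 - index))


def to_padded_bytes(acc, n):
    """Serialise an n-bit accumulator to bytes, big-endian.

    Assumption: pad bits go at the end of the last byte and are zero,
    so the value is shifted left by the pad width before conversion.
    """
    pad = -n % 8  # number of unused bits in the final byte
    return (acc << pad).to_bytes((n + pad) // 8, 'big')


# Usage: a 10-bit filter with the first and last bits set.
acc = 0
acc = set_bit(acc, 0, 10)
acc = set_bit(acc, 9, 10)
encoded = to_padded_bytes(acc, 10)
```

Without the shift, the padding would end up in front of the data instead of behind it, so the two approaches would only agree when `n` is a multiple of 8.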
In terms of speed, the two approaches are comparable:
>>> from bitarray import bitarray
>>> from functools import partial
>>> from hashlib import sha256
>>> from timeit import timeit
>>>
>>> def current(n, k):
...     ba = bitarray(n)
...     ba.setall(False)
...     for i in range(k):
...         l = int.from_bytes(sha256(str(i).encode('ascii')).digest(), 'big')
...         ba[l % n] = True
...     return ba.tobytes()
...
>>> def alternative(n, k):
...     c = 0
...     one = 1 << n - 1
...     for i in range(k):
...         l = int.from_bytes(sha256(str(i).encode('ascii')).digest(), 'big')
...         c |= one >> l % n
...     return c.to_bytes((n + 7) // 8, 'big')
...
>>> assert current(1024, 600) == alternative(1024, 600)
>>> timeit(partial(current, 1024, 600), number=10000)
12.497227690997534
>>> timeit(partial(alternative, 1024, 600), number=10000)
11.94743290200131
I would like to discuss this proposal before it is implemented. (@hardbyte @wilko77 Any thoughts?)
Aha! Link: https://csiro.aha.io/features/ANONLINK-29
If anyone feels inclined to do this I'd merge it in - provided that in doing so we didn't break support for Python 2.
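On the Python 2 point: `int.from_bytes` and `int.to_bytes` only exist on Python 3, so an int-based version would need its own conversions there. One possible sketch goes through hex with `binascii`, which is in the standard library on both versions. The helper names `int_to_bytes` and `int_from_bytes` are hypothetical, and this has not been benchmarked against bitarray.

```python
import binascii


def int_to_bytes(value, length):
    """Big-endian encoding of a non-negative int into `length` bytes
    (hypothetical helper, intended to run on Python 2 and 3)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    # Zero-pad to exactly two hex digits per output byte.
    hex_digits = '%0*x' % (2 * length, value)
    if len(hex_digits) > 2 * length:
        raise OverflowError("value does not fit in %d bytes" % length)
    return binascii.unhexlify(hex_digits)


def int_from_bytes(data):
    """Big-endian decoding of a non-empty byte string into an int."""
    return int(binascii.hexlify(data), 16)


# Usage: round-trip a two-byte value.
raw = int_to_bytes(0x8040, 2)
value = int_from_bytes(raw)
```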
The only issue I've had with bitarray is that they don't upload binary wheels to PyPI, meaning users need a C compiler to install it. I've opened an upstream PR to build cross-platform wheels of bitarray.