pyxorfilter
Python bindings for the C implementation of Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters
Installation
pip install pyxorfilter
From Source
git clone --recurse-submodules https://github.com/glitzflitz/pyxorfilter
cd pyxorfilter
python setup.py build_ext
python setup.py install
Usage
>>> from pyxorfilter import Xor8, Xor16
>>> filter = Xor8(5)  # or Xor16(size)
>>> # Supports Unicode strings and heterogeneous types
>>> test_str = ["あ","अ", 51, 0.0, 12.3]
>>> filter.populate(test_str)
True
>>> filter.contains("अ")
True
>>> filter[51]  # __getitem__ can be used instead of contains()
True
>>> filter["か"]
False
>>> filter.contains(150)
False
>>> filter.size_in_bytes()
60
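The 60 bytes reported above can be reproduced with a back-of-the-envelope sketch. It assumes that the underlying C library allocates 32 + 1.23 × n fingerprint slots, rounded down to a multiple of 3, and adds a 24-byte struct header (two 64-bit integers and a pointer on a 64-bit platform). Neither detail is stated in this README, and `estimated_size_in_bytes` is a hypothetical helper, not part of pyxorfilter.

```python
# Assumption: mirrors the sizing believed to be used by the C library;
# this is an illustrative estimate, not pyxorfilter's own API.
HEADER_BYTES = 24  # assumed: seed (8) + block length (8) + pointer (8)


def estimated_size_in_bytes(num_keys: int, fingerprint_bytes: int) -> int:
    """Estimate the filter size for num_keys keys.

    fingerprint_bytes is 1 for Xor8 and 2 for Xor16.
    """
    # Assumed capacity rule: 32 extra slots plus 23% overhead per key.
    capacity = int(32 + 1.23 * num_keys)
    # The table is split into three equal blocks, so round down to a multiple of 3.
    capacity = capacity // 3 * 3
    return capacity * fingerprint_bytes + HEADER_BYTES


if __name__ == "__main__":
    print("Xor8, 5 keys:", estimated_size_in_bytes(5, 1))
    print("Xor16, 5 keys:", estimated_size_in_bytes(5, 2))
```

Under these assumptions the fixed overhead dominates for tiny filters like the one above; for large key sets the cost approaches about 1.23 fingerprints per key.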
Caveats
Accuracy
For higher accuracy (fewer false positives), use Xor16. It takes more memory than Xor8 but is more accurate.
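To put rough numbers on that trade-off, here is a minimal sketch. It assumes the usual theoretical estimate for fingerprint-based filters, where a k-bit fingerprint gives a false-positive probability of about 2^-k. The helper name is hypothetical and the figures are theoretical, not measured with pyxorfilter.

```python
def theoretical_false_positive_rate(fingerprint_bits: int) -> float:
    """Approximate chance that an absent key matches a stored fingerprint."""
    # Assumption: fingerprints behave like uniformly random k-bit values.
    return 2.0 ** -fingerprint_bits


if __name__ == "__main__":
    for name, bits in (("Xor8", 8), ("Xor16", 16)):
        rate = theoretical_false_positive_rate(bits)
        print(f"{name}: about {rate:.4%} (1 in {round(1 / rate)})")
```

In other words, Xor16 doubles the per-key storage in exchange for a false-positive rate that is roughly 256 times lower.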
Overflow
Xor8 and Xor16 take uint8_t and uint16_t respectively. Make sure that the input is unsigned.
TODO
- [x] Add unit tests
- [x] Add CI support for distributing pyxorfilter with PyPI.
Links
- C Implementation
- Go Implementation
- Erlang bindings
- Rust Implementation: 1 and 2
- C++ Implementation
- Java Implementation