Python bindings for xorfilter (faster and smaller than Bloom and cuckoo filters)

pyxorfilter

Python bindings for the C implementation of Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters

Installation

pip install pyxorfilter

From Source

git clone --recurse-submodules https://github.com/glitzflitz/pyxorfilter
cd pyxorfilter
python setup.py build_ext
python setup.py install

Usage

>>> from pyxorfilter import Xor8, Xor16
>>> filter = Xor8(5)  # or Xor16(size)
>>> # Supports Unicode strings and heterogeneous types
>>> test_str = ["あ","अ", 51, 0.0, 12.3]
>>> filter.populate(test_str)
True
>>> filter.contains("अ")
True
>>> filter[51]  # You can use __getitem__ instead of contains
True
>>> filter["か"]
False
>>> filter.contains(150)
False
>>> filter.size_in_bytes()
60

Caveats

Accuracy

For higher accuracy (fewer false positives), use Xor16, which is larger but more accurate.
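
To make the trade-off concrete, here is a rough back-of-the-envelope sketch. It does not call pyxorfilter. It assumes the approximations given in the xor filter paper: a false-positive rate of about 2^-k for k-bit fingerprints, and about 1.23 * k bits of storage per key. The function names are hypothetical and not part of the library.

```python
# Estimates only. The 2**-k rate and the 1.23 factor are assumptions
# taken from the xor filter paper, not values reported by pyxorfilter.

def estimated_false_positive_rate(fingerprint_bits: int) -> float:
    """Approximate chance that a key absent from the set is reported present."""
    return 2.0 ** -fingerprint_bits


def estimated_bits_per_key(fingerprint_bits: int) -> float:
    """Approximate storage cost per key for large sets."""
    return 1.23 * fingerprint_bits


for name, bits in (("Xor8", 8), ("Xor16", 16)):
    rate = estimated_false_positive_rate(bits)
    cost = estimated_bits_per_key(bits)
    print(f"{name}: ~{rate:.4%} false positives, ~{cost:.2f} bits per key")
```

Under these assumptions, Xor16 uses about twice the memory of Xor8 per key in exchange for a false-positive rate that is 256 times lower.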

Overflow

Xor8 and Xor16 take uint8_t and uint16_t, respectively. Make sure that the input is unsigned.

TODO

  • [x] Add unit tests
  • [x] Add CI support for distributing pyxorfilter with PyPI.

Links