pybloomfiltermmap3

Is there a multithreaded method for adding values?

Open LzyloveRila opened this issue 2 years ago • 2 comments

Adding 1 billion values to a Bloom filter takes nearly 4 hours on my server. I think it should be possible to compute the hashes for batches of values in multiple threads.

LzyloveRila, Apr 20 '23 07:04

My strategy was to create many filters in parallel and then merge them using the .union() function. For example, with a billion values you can create 1000 filters and add values to them in parallel.

mireklzicar, May 17 '23 16:05

Multithreading won't necessarily help, since adding and hashing are CPU-bound operations. @mireklzicar's approach is preferable.

prashnts, Jun 14 '23 08:06
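
For readers who want to see the shape of the "build many filters in parallel, then union them" strategy described in this thread, here is a minimal sketch. It deliberately does not use the pybloomfiltermmap3 API: the filter is a toy stand-in built on `bytearray` and `hashlib`, and every name in it (`NUM_BITS`, `NUM_HASHES`, `bit_positions`, `build_partial`, `merge`, `contains`, `build_parallel`) is hypothetical. It uses separate processes rather than threads because the work is CPU-bound. The key assumption it illustrates is that all partial filters share the same size and hash functions, so merging is a plain bitwise OR; with the real library, check its documentation for which filters `.union()` accepts.

```python
import hashlib
from concurrent.futures import ProcessPoolExecutor

# Illustrative parameters (assumptions, not library defaults).
NUM_BITS = 1 << 20   # filter size in bits; must be identical for all partial filters
NUM_HASHES = 7       # number of bit positions set per value


def bit_positions(value: str):
    """Derive NUM_HASHES bit positions from one digest using double hashing."""
    digest = hashlib.blake2b(value.encode("utf-8"), digest_size=16).digest()
    h1 = int.from_bytes(digest[:8], "little")
    h2 = int.from_bytes(digest[8:], "little") | 1  # force odd so the stride is nonzero
    return [(h1 + i * h2) % NUM_BITS for i in range(NUM_HASHES)]


def build_partial(values):
    """Build one partial filter for a chunk of values; runs inside a worker."""
    bits = bytearray(NUM_BITS // 8)
    for value in values:
        for pos in bit_positions(value):
            bits[pos >> 3] |= 1 << (pos & 7)
    return bytes(bits)


def merge(partials):
    """Union partial filters with a bitwise OR (valid only for identical parameters)."""
    merged = 0
    for partial in partials:
        merged |= int.from_bytes(partial, "little")
    return merged.to_bytes(NUM_BITS // 8, "little")


def contains(bits, value):
    """Membership test: True if every bit position for the value is set."""
    return all(bits[pos >> 3] & (1 << (pos & 7)) for pos in bit_positions(value))


def build_parallel(values, workers=4):
    """Split values into one chunk per worker process, then merge the results."""
    chunks = [values[i::workers] for i in range(workers)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return merge(pool.map(build_partial, chunks))
```

To try it, call `build_parallel(list_of_strings)` from a script guarded by `if __name__ == "__main__":` (process pools need that guard on platforms that spawn workers), then query the result with `contains`. Because the merge is a bitwise OR over filters with identical parameters, the merged filter has the same bits as one built serially from all the values.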