connect
DRAFT: Add probabilistic in-memory caches (bloom and cuckoo) with sharding for dedupe/cached
This adds support for two probabilistic filters (bloom and cuckoo) that can be used as the cache for the dedupe processor.
Optionally, these caches can dump their state to the filesystem and restore it on startup. This is useful to avoid starting the daemon with an empty cache.
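For readers unfamiliar with the idea, here is a minimal sketch of how a Bloom filter can answer "have I seen this key before?" for dedupe. It is only an illustration and not the code in this PR: the names `bloomFilter`, `newBloomFilter`, `hashes` and `seenBefore` are hypothetical, the sizes are arbitrary, and sharding and the dump/restore step are left out.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloomFilter is a hypothetical, minimal Bloom filter used only for
// illustration; it is not the implementation from this PR.
type bloomFilter struct {
	bits []uint64 // bit set packed into 64-bit words
	m    uint64   // total number of bits
	k    uint64   // number of probes per key
}

func newBloomFilter(m, k uint64) *bloomFilter {
	return &bloomFilter{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes derives two 64-bit values from one FNV-1a pass so that the
// k probe positions can be computed with double hashing.
func hashes(key []byte) (uint64, uint64) {
	h := fnv.New64a()
	h.Write(key)
	h1 := h.Sum64()
	h.Write([]byte{0xff}) // extend the input to get a second value
	h2 := h.Sum64() | 1   // force odd so the step is never zero
	return h1, h2
}

// seenBefore sets the bits for key and reports whether all of them were
// already set. A true result may be a false positive; a false result is
// always correct (the key was definitely not added before).
func (f *bloomFilter) seenBefore(key []byte) bool {
	h1, h2 := hashes(key)
	seen := true
	for i := uint64(0); i < f.k; i++ {
		pos := (h1 + i*h2) % f.m
		word, mask := pos/64, uint64(1)<<(pos%64)
		if f.bits[word]&mask == 0 {
			seen = false
			f.bits[word] |= mask
		}
	}
	return seen
}

func main() {
	f := newBloomFilter(1<<16, 4)
	for _, id := range []string{"a", "b", "a"} {
		fmt.Printf("id=%s duplicate=%v\n", id, f.seenBefore([]byte(id)))
	}
}
```

The memory cost is fixed by `m` regardless of how many keys are added, which is where the savings over an exact in-memory cache come from. The trade-off is that a small fraction of new messages can be wrongly reported as duplicates.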
Motivation: I need to dedupe a huge amount of data, and this pull request should reduce the memory and other resources that requires.
This PR could be combined with https://github.com/benthosdev/benthos/pull/2123 in the future.
@mihaitodor do you think it is OK now? @Jeffail what do you think?