cuprate icon indicating copy to clipboard operation
cuprate copied to clipboard

Usage of filter data structures over sets

Open hinto-janai opened this issue 4 months ago • 1 comments

What

Filter data structures are smaller & faster than traditional sets.

Cuprate could use these over traditional sets (HashSet, BTreeSet) in cases where all we need to know is:

  • Does the set not contain x?

Off the top of my head, key image & peerlist checks could benefit from this.

These could also be used for a filter layer above the database to avoid disk access.

Filter usage in other projects:

  • https://github.com/search?q=repo%3Abitcoin%2Fbitcoin%20bloom&type=code
  • https://github.com/search?q=repo%3Aparadigmxyz%2Freth+bloom&type=code
  • https://github.com/search?q=repo%3AZcashFoundation%2Fzebra%20bloom&type=code

These are essentially optimizations (which also come with negative side-effects, e.g. removal of a set member is more complicated in filters), so it isn't necessary right now, although we should eventually get around to it.

Filters

Some possible filters we could use. Starting from the most known with the longest history.

Filter Description
Bloom https://en.wikipedia.org/wiki/Bloom_filter
Cuckoo https://en.wikipedia.org/wiki/Cuckoo_filter
Xor https://arxiv.org/abs/1912.08258
Ribbon https://arxiv.org/abs/2103.02515

hinto-janai avatar Feb 23 '24 16:02 hinto-janai