swift-collections icon indicating copy to clipboard operation
swift-collections copied to clipboard

[HashTreeCollections] Remove subtree count augmentation

Open lorentey opened this issue 3 years ago • 1 comments

Node references in persistent collections currently include the count of all items in the corresponding subtree. This makes index(_:offsetBy:) an operation with logarithmic complexity, but it wastes a bunch of memory on storing these counts. This is very unlikely to be a worthwhile trade off, as these data structures are unordered, and it seem improbable that someone would want to randomly jump around within them.

Remove the count augmentation, to boost memory performance and to make these prefix trees even more competitive in memory use vs the standard Set and Dictionary.

lorentey avatar Nov 02 '22 18:11 lorentey

Note that the CHAMP data structure, as discussed in the OOPSLA '15 paper, does not have sub-tree count augmentation. This can indeed be highly beneficial in terms of reducing memory footprints, depending on the memory alignment of the nodes. See Section 6 of the paper for a performance evaluation on the JVM platform.

For PersistentSet there would be a performance implication when performing structural set-algebra operation, e.g., when pruning or adding complete sub-trees. Without knowing the size of the sub-tree, one would have to iterate over the sub-tree (i.e., O(n) in terms of sub-tree size) to update the collections count accordingly.

In the end it's a trade-off between memory footprint and performance of some operations.

msteindorfer avatar Nov 03 '22 17:11 msteindorfer