swift-collections
swift-collections copied to clipboard
Counted Set
How about a counted set/multiset type? These are pretty commonly used, and can be quite handy:
https://developer.apple.com/documentation/foundation/nscountedset https://developer.apple.com/documentation/corefoundation/cfbag-s1l
Yep -- counted multisets are useful, and they would be an interesting API design exercise!
I think an unordered counted multiset that stores its contents in a Dictionary
might be a relatively straightforward addition. (I wonder if they can be implemented better than just a dictionary mapping set elements to their multiplicities -- perhaps there is some optimization opportunity there, too.)
An ordered multiset would likely not be counting -- it would need to preserve duplicate items so that it can keep them in the desired order. This would be much like OrderedSet
except without the uniqueness requirement. (I.e., the benefit over Array would be that it provided a fast way to find all instances of particular items.) Unfortunately, our existing hash table implementations can't handle duplicate keys well, so we'd need to switch away from open addressing with linear probing. (Which isn't necessarily a big deal, but it does lead to concerns about memory allocation overhead & general performance.)
We already have plans to use the B-tree implementation that we develop in #1 to add a sorted multiset type. This too would keep duplicates as is, rather than counting them.
Part of the API design challenge is to figure out how to expose the multiplicities while also keeping the interface familiar to Swift users. (SetAlgebra definitely can't handle multisets, but its high-level operations translate pretty well, I think.)
API design-wise, I expect a counted multiset will have a very different interface than a dupe-preserving one. For example, would CountedUnorderedMultiset be a Collection? If so, what would be its Element type?
As CountedSet
(aka CountedUnorderedMultiset) is conceptually a Dictionary<Key, Int>
, I think it would make sense to have an equivalent Element type: (Key, Int)
. Similarly, I think it would be OK for CountedSet
to conform to Collection like Dictionary
. However, arguably Dictionary
's direct collection conformance is a mistake, so maybe CountedSet
shouldn't follow its precedent there (we could vend a collection view instead).