fastutil icon indicating copy to clipboard operation
fastutil copied to clipboard

BooleanMap/Set implementations should delegate to a super simple data structure

Open techsy730 opened this issue 4 years ago • 10 comments

Instead of having grand data structures for a keyspace of only 2 elements, all BooleanMap/Set implementations should delegate to a super simple implementation that only has whether true is present, whether false is present, and for maps, the true value and the false value. This should be super fast and super compact in memory.

The only complexity would be that for some implementations, a sorted order needs to be maintained (those that implement SortedBooleanMap/Set). In others, we need to maintain insertion order. (For implementations that do not assure a key order, we can just use the sorted order for free).

techsy730 avatar Jul 17 '19 23:07 techsy730

Oh, I didn't notice that there aren't even Boolean2WhateverMaps generated. So I guess make this feature request only about BooleanSet

techsy730 avatar Jul 18 '19 00:07 techsy730

"All BooleanMap/Set implementations" sounds a bit grand for a single implementation: BooleanOpenHashSet. LOL.

vigna avatar Jul 18 '19 06:07 vigna

Yep, the whole point is that something like "BooleanOpenHashSet" is complete overkill for booleans. It should just be two boolean variables (whether true and false are present). All of the boolean sets should be. The only reason I didn't suggest getting rid of all the boolean set implementations and just have one or two of this sort of super simple implementation is for backwards compatibility.

techsy730 avatar Jul 18 '19 16:07 techsy730

Ok. In the interest of clarity, can you name another implementation beside BooleanOpenHashSet?

vigna avatar Jul 19 '19 05:07 vigna

Hmm, looking over it, BooleanOpenHashSet and BooleanArraySet are the only concrete set type for booleans currently. There isn't even a SortedSet implementation.

So I guess then we can just make a new set implementation, BooleanSimpleSet or BooleanFastSet or BooleanDirectSet or something, that is just two boolean member variables.

Optionally, we can then make those two set implementations delegate to that implementation (though care would be needed for BooleanArraySet to preserve insertion order)

techsy730 avatar Aug 21 '19 17:08 techsy730

+1. BooleanOpenHashSet and BooleanArraySet should be deprecated.

leventov avatar Oct 28 '19 08:10 leventov

I'd just write ten-line specific implementations of those, with a two-element boolean array remembering whether true or false are in the set. Pull requests, anybody?

vigna avatar Oct 29 '19 19:10 vigna

I might be willing to try this. I just had my hands full with the Spliterator stuff at the time. However, implementing NavigableMap may be a better first step.

techsy730 avatar Dec 22 '20 00:12 techsy730

Yes I would try to release 8.5.0 with #162 and the attack navigabile. There's even a proposal from 2017 with the interfaces 🙈.

vigna avatar Dec 22 '20 07:12 vigna

Given that a boolean set can have only 4 possible states, {}, {false}, {true}, and {false, true}, it would make sense to have an immutable boolean set implementation alongside the mutable one.

techsy730 avatar Dec 30 '20 01:12 techsy730