Add support for NavigableSet methods in SortedSetIterable
Add NavigableSet support to SortedSetIterable
๐ Summary
Adds full NavigableSet support to SortedSetIterable hierarchy, resolving issue #1746.
๐ฏ What's Changed
New Methods Added
- Navigation methods:
lower(),floor(),ceiling(),higher() - Descending operations:
descendingSet(),descendingIterator() - Inclusive/exclusive subsets:
subSet(from, fromInc, to, toInc),headSet(to, inc),tailSet(from, inc)
Key Changes
- โ
MutableSortedSetnow implementsNavigableSet<T>(wasSortedSet<T>) - โ
toReversed()now works (no longer throwsUnsupportedOperationException) - โ
Default implementations in
SortedSetIterableforreverseForEach(),reverseForEachWithIndex(),detectLastIndex() - โ All implementations covered: mutable, immutable, wrappers (synchronized, unmodifiable, adapter)
๐งช Testing
- Comprehensive tests added for all new methods
- Edge cases covered: empty sets, missing elements, double descending, custom comparators
- Existing tests continue to pass
๐ Implementation Details
ImmutableTreeSet Internal Change
Changed: T[] delegate โ MutableList<T> delegate
Rationale:
- Uses
FastList.newListWith(array)which wraps the array without copying - Trade-off: +4 bytes (1 reference) per instance vs reimplementing all
RichIterable/OrderedIterableoperations - Enables structural sharing:
subSet(),headSet(),tailSet(), anddescendingSet()return views that share the underlying structure, followingNavigableSetsemantics and matchingAbstractImmutableListbehavior
Would appreciate maintainer feedback on this approach
Structural Sharing
- Important:
ImmutableTreeSetnow uses structural sharing for subset operations (likeNavigableSet.subSet()andAbstractImmutableList) - Views share the underlying
MutableListrather than copying data - This follows the
NavigableSetspec and is consistent with other Eclipse Collections immutable implementations (e.g.AbstractImmutableList.subList())
ReversedList Implementation Choice
Current Approach:
- Private static
ReversedListclass inImmutableTreeSetextendingUnmodifiableMutableList - Localized implementation specific to
ImmutableTreeSetneeds
Alternative Considered - Global Implementation:
- Could have implemented a global
FastList.ReverseFastList implements MutableList, ReversibleIterable - Pros:
- Would enable code reuse (e.g., for future
NavigableMapsupport onSortedMapIterable) - Could directly share the underlying array with the original
FastList(saving 1 reference vs current approach)
- Would enable code reuse (e.g., for future
- Cons: Requires larger refactoring
- Currently
ReversibleIterable.asReversed()returnsLazyIterable, notRichIterable - Would need to implement
ReverseIterable, and thusLazyIterable, to matchAbstractMutableList.asReversed()
- Currently
Decision: Chose the localized approach to minimize scope of changes while keeping the door open for future refactoring if NavigableMap support is added.
๐๏ธ Architecture Notes
NavigableSet Inheritance Strategy
API Hierarchy:
SortedSetIterableprovides the same API asNavigableSetbut does not extend itMutableSortedSet extends NavigableSetโ (direct extension, no issues)ImmutableSortedSetdoes not extendNavigableSet
Concrete Implementations:
AbstractImmutableSortedSetdoes extend NavigableSet- This requires concrete method return types (e.g.,
AbstractImmutableSortedSet headSet(T)instead ofImmutableSortedSet headSet(T))
Rationale:
- Maintains flexibility at the interface level (
SortedSetIterableis not tied to JDK types) - Concrete implementations can satisfy
NavigableSetcontract for Java interoperability - Follows Eclipse Collections pattern of separating API contracts from JDK implementations
๐ Notes
- Semantics match
NavigableSetspec exactly (e.g.,subSet()returns views, not copies) - JavaDoc to be discussed with maintainers (inherit from
NavigableSetvs explicit documentation)
I'm in the middle of reviewing this because it's a big patch but so far it looks really good - just wanted to say thanks.
@motlin
And if you don't mind me asking, I'm curious which AI tools you used for this PR. The code doesn't look AI generated, but the PR description looks like it may be.
I used a little bit of Anthropic's Opus for analysis and Sonnet for some code, but wrote critical parts by hands.
But, I used 99% LLM for the PR description as, all in all, while it's very emoji-y, it does make it pretty clear. (Plus, English isn't my main language, so better safe than sorry).
Overall this looks great, and I left a few questions and nitpicky comments.
I've been a very happy user of EC for almost 10 years now, so I'm glad I finally found the time to tackle this :) (One day, if I manage to reduce my workload, I'll try to push for Clojure-like persistent data-structures ;] That would be a very fun project.)
Thing is... I thought about it lengthily today, and found that my solution has glaring pitfalls. Main one is : I rely on FastList, which means some methods use a direct loop over the underlying array to iterate instead of iterator (e.g. injectInto).
As such, the reverse view, descendingSet, is totally buggy for these cases. Tracking which methods we'd have to override depending on the inner implementation of FastList would be extremely leaky design and a nightmare to maintain.
I'm thinking of redoing the code from scratch, as that problem can't be circumvented with the current solution.
What I'm thinking as a better solution would be to implement a proper FastList-like Immutable variant (e.g. ImmutableInPlaceArrayAdaptee), which would be unsafe to use outside of the eclipse-collections code for unknowing users, but would allow for efficient reversing / sublisting with full structural sharing and no wrappers.
The idea would be :
- avoid the array copy in the ctor, hence the class being 100% unsafe if used if a reference to the array is kept
- implement something akin to IntInterval, with final fields from/to, and methods for reverse/subList
- keep that range as a
private finalfield, to handle both reversing and sublisting as O(1) in time/memory, avoiding any unnecessary wrappers and thus indirections - implement iterations with direct loops over the underlying array, using a local
from/to(similar to howIntInteralhandles interations internally) - reverse iteration would only cost one
ifovergetStep(), which the JIT should be able to optimize easily - that solution could in turn be used for an hypothetic
NavigableMapimplementation forImmutableTreeMap
One thing I'd like to explore is fuller test coverage by moving tests into the unit-tests-java8 hierarchy. I'm going to give it a shot by asking Claude to implement sorted equivalents of unit-tests-java8/src/test/java/org/eclipse/collections/test/set/SetTestCase.java and unit-tests-java8/src/test/java/org/eclipse/collections/test/set/mutable/HashSetTest.java.
Yes, I forgot some things (floor and such on the unmodifiable variant), and test coverage should be vastly improved. Especially, It would require a big refactoring of AbstractSortedSetTestCase to check that views (descendingSet, headSet, tailSet, subSet) are 100% correct... which is not the case with the current implementation as I said earlier.
I'll try to implement a cleaner version with that idea to cover all these pitfalls if you agree it might be a more sensible approach.
ps: An illustration of what I mean :
/*
* class is unsafe, but could be used internally for :
* - ImmutableTreeSet/ImmutableTreeMap
* - places where we're 100% sure to have a safe array and want to return an ImmutableList
*/
public final class ImmutableInPlaceArrayAdapter<T> extends AbstractImmutableList<T>
{
private final T[] array;
private final int from;
private final int to;
// no copy, like FastList
public ImmutableInPlaceArrayAdapter<T>(T[] array)
{
// throw on empty array -> should use ImmutableEmptyList instead
this(array, 0, array.length - 1);
}
private ImmutableInPlaceArrayAdapter<T>(T[] array, int from, int to)
{
this.array = array;
this.from = from;
this.to = to;
}
private int physicalIndex(int logicalIndex)
{
if (from <= to)
{
return to - logicalIndex;
}
else
{
return fromIndex + localIndex;
}
}
@Override
public int size()
{
return to - from;
}
// constant and inplace
@Override
public ImmutableInPlaceArrayAdapter toReversed()
{
return new ImmutableInPlaceArrayAdapter<>(this.array, this.to, this.from);
}
// constant and inplace
@Override
public ImmutableInPlaceArrayAdapter subList(int fromInclusive int toExclusive)
{
// range check
return new ImmutableInPlaceArrayAdapter<>(this.array, physicalIndex(fromInclusive), physicalIndex(toExclusive) - 1);
}
@Override
public void each(Procedure proc)
{
// JIT should optimize the condition after the first test
// if not, replace it with an explicit final boolean field
if (from <= to)
{
// direct iteration over the array => better performance
for (int i = from; i <= to; ++i)
{
proc.apply(array[i]);
}
}
else
{
for (int i = to; i <= from; --i)
{
proc.apply(array[i]);
}
}
}
/* ... */
/*
* iterator / listIterator
*/
}
I know that this class would bring additional complexity and maintenance costs over the existing ImmutableArrayList, which does the same thing but with some slight differences :
- safe to use as it copies the input array
toReversedisn't in place
So I wouldn't want to change that class and risk breaking existing behaviors.
All of this could be somewhat circumvented by making this new class abstract, thus forcing end-users to inherit from it to use it. That way, users can't complaint they didn't understand what they were doing : they had to very explicitly wish for that behavior.
Honestly, I think this implementation is already pretty close to ready and doesn't warrant a rewrite. ImmutableTreeSet used to be backed by an array. Having it be backed by a list is fine too IMO. If it's an ImmutableArrayList, you know it's safe. If you take a subset of a subset, that would delegate to sublist of a sublist, which is already optimized to reduce down to a single view. Or if you want to keep an array for memory efficiency, you can delegate to ArrayAdapter for iteration patterns in many cases.
Problem is : in theory it is.
In practice, as I use FastList to avoid copying the array, I'd have to override all methods where iteration is done internally with a loop => I can do it right now, but the coupling will make maintenance painful going forward.
I could've used ArrayAdapter instead, but in-place reverse would've been either impossible (thus breaking descendingSet contract) or painfully hacky.
I might be too obstinate or fearful here, but I seriously think my current approach must be scrapped, as it's buggy, and making it work would lead to a brittle solution.
I want to try to come up with another PR that focuses on correctness as an alternative, albeit more verbose.
I'll also try to make my current approach correct, but I'm not confident it will be at least good enough, sadly :(
Then, your call on which solution to pick :)
[edit] Existing wrappers comparisons :
| class | inplace ctor | implace subList | implace toReversed |
|---|---|---|---|
| ImmutableArrayList | KO (defensive copy) | OK (ImmutableSubList) | KO (copy over asReversed) |
| ArrayAdapter | OK | OK? (ListAdapter over RandomAccessSubList) | KO (clone) |
Comparing to TreeSet, which delegate to TreeMap, where they manage to keep everything in place with at most 1 indirection. They achieves this with a very complex implementation with a combination of TreeMap, DescendingSubMap and AscendingSubMap. Doing so, they guarantee at most one wrapper, even when doing crazy things like new TreeSet<>(Arrays.asList(1, 2, 3)).descendingSet().subSet(2, 1).descendingSet().subSet(2, 2).
So what I'm trying to achieve with my new adapter is to replicate this specific behavior : inplace reverse/slicing and while we keep indirections to a minimum.
I'm looking around at the adapters we have, and we're missing an adapter for a random access list that is itself a random access list. We have ReverseIterable, but that doesn't provide indexed access.
My thought about ImmutableInPlaceArrayAdapter is that it's pretty specific. A reverse list adapter would be useful anyway, so we might as well build it. And then it might be possible to build what we need with several wrappers. Collapsing them all into a single wrapper might be a micro optimization, but we also may not need to do it.
My thought about ImmutableInPlaceArrayAdapter is that it's pretty specific.
I tried my hand at implementing it to show what that could look like : https://github.com/eclipse-collections/eclipse-collections/compare/master...tampix:eclipse-collections:navigableset2
I kept it very straightforward as my aim was to optimize for memory-efficiency, not raw performance (except when it makes sense). So most methods forward to RandomAccessListIterate.
As it's still in draft, I've not added specific unit-tests yet, just the inherited ones, but it does the job as a PoC.
A reverse list adapter would be useful anyway, so we might as well build it. And then it might be possible to build what we need with several wrappers. Collapsing them all into a single wrapper might be a micro optimization, but we also may not need to do it.
That might a better solution than the one I suggested. One unique list adapter optimized for both slicing and reversing would be more generic. I could see something like this being useful to implement things like chunk, sliding windows and sub. But it might take a little bit more work to implement than my quick and dirty solution :)