eclipse-collections icon indicating copy to clipboard operation
eclipse-collections copied to clipboard

Add support for NavigableSet methods in SortedSetIterable

Open tampix opened this issue 2 months ago โ€ข 6 comments

Add NavigableSet support to SortedSetIterable

๐Ÿ“‹ Summary

Adds full NavigableSet support to SortedSetIterable hierarchy, resolving issue #1746.

๐ŸŽฏ What's Changed

New Methods Added

  • Navigation methods: lower(), floor(), ceiling(), higher()
  • Descending operations: descendingSet(), descendingIterator()
  • Inclusive/exclusive subsets: subSet(from, fromInc, to, toInc), headSet(to, inc), tailSet(from, inc)

Key Changes

  • โœ… MutableSortedSet now implements NavigableSet<T> (was SortedSet<T>)
  • โœ… toReversed() now works (no longer throws UnsupportedOperationException)
  • โœ… Default implementations in SortedSetIterable for reverseForEach(), reverseForEachWithIndex(), detectLastIndex()
  • โœ… All implementations covered: mutable, immutable, wrappers (synchronized, unmodifiable, adapter)

๐Ÿงช Testing

  • Comprehensive tests added for all new methods
  • Edge cases covered: empty sets, missing elements, double descending, custom comparators
  • Existing tests continue to pass

๐Ÿ” Implementation Details

ImmutableTreeSet Internal Change

Changed: T[] delegate โ†’ MutableList<T> delegate

Rationale:

  • Uses FastList.newListWith(array) which wraps the array without copying
  • Trade-off: +4 bytes (1 reference) per instance vs reimplementing all RichIterable/OrderedIterable operations
  • Enables structural sharing: subSet(), headSet(), tailSet(), and descendingSet() return views that share the underlying structure, following NavigableSet semantics and matching AbstractImmutableList behavior

Would appreciate maintainer feedback on this approach

Structural Sharing

  • Important: ImmutableTreeSet now uses structural sharing for subset operations (like NavigableSet.subSet() and AbstractImmutableList)
  • Views share the underlying MutableList rather than copying data
  • This follows the NavigableSet spec and is consistent with other Eclipse Collections immutable implementations (e.g. AbstractImmutableList.subList())

ReversedList Implementation Choice

Current Approach:

  • Private static ReversedList class in ImmutableTreeSet extending UnmodifiableMutableList
  • Localized implementation specific to ImmutableTreeSet needs

Alternative Considered - Global Implementation:

  • Could have implemented a global FastList.ReverseFastList implements MutableList, ReversibleIterable
  • Pros:
    • Would enable code reuse (e.g., for future NavigableMap support on SortedMapIterable)
    • Could directly share the underlying array with the original FastList (saving 1 reference vs current approach)
  • Cons: Requires larger refactoring
    • Currently ReversibleIterable.asReversed() returns LazyIterable, not RichIterable
    • Would need to implement ReverseIterable, and thus LazyIterable, to match AbstractMutableList.asReversed()

Decision: Chose the localized approach to minimize scope of changes while keeping the door open for future refactoring if NavigableMap support is added.

๐Ÿ—๏ธ Architecture Notes

NavigableSet Inheritance Strategy

API Hierarchy:

  • SortedSetIterable provides the same API as NavigableSet but does not extend it
  • MutableSortedSet extends NavigableSet โœ… (direct extension, no issues)
  • ImmutableSortedSet does not extend NavigableSet

Concrete Implementations:

  • AbstractImmutableSortedSet does extend NavigableSet
  • This requires concrete method return types (e.g., AbstractImmutableSortedSet headSet(T) instead of ImmutableSortedSet headSet(T))

Rationale:

  • Maintains flexibility at the interface level (SortedSetIterable is not tied to JDK types)
  • Concrete implementations can satisfy NavigableSet contract for Java interoperability
  • Follows Eclipse Collections pattern of separating API contracts from JDK implementations

๐Ÿ“ Notes

  • Semantics match NavigableSet spec exactly (e.g., subSet() returns views, not copies)
  • JavaDoc to be discussed with maintainers (inherit from NavigableSet vs explicit documentation)

tampix avatar Oct 09 '25 10:10 tampix

I'm in the middle of reviewing this because it's a big patch but so far it looks really good - just wanted to say thanks.

motlin avatar Oct 11 '25 12:10 motlin

@motlin

And if you don't mind me asking, I'm curious which AI tools you used for this PR. The code doesn't look AI generated, but the PR description looks like it may be.

I used a little bit of Anthropic's Opus for analysis and Sonnet for some code, but wrote critical parts by hands.

But, I used 99% LLM for the PR description as, all in all, while it's very emoji-y, it does make it pretty clear. (Plus, English isn't my main language, so better safe than sorry).

Overall this looks great, and I left a few questions and nitpicky comments.

I've been a very happy user of EC for almost 10 years now, so I'm glad I finally found the time to tackle this :) (One day, if I manage to reduce my workload, I'll try to push for Clojure-like persistent data-structures ;] That would be a very fun project.)

Thing is... I thought about it lengthily today, and found that my solution has glaring pitfalls. Main one is : I rely on FastList, which means some methods use a direct loop over the underlying array to iterate instead of iterator (e.g. injectInto).

As such, the reverse view, descendingSet, is totally buggy for these cases. Tracking which methods we'd have to override depending on the inner implementation of FastList would be extremely leaky design and a nightmare to maintain.

I'm thinking of redoing the code from scratch, as that problem can't be circumvented with the current solution.

What I'm thinking as a better solution would be to implement a proper FastList-like Immutable variant (e.g. ImmutableInPlaceArrayAdaptee), which would be unsafe to use outside of the eclipse-collections code for unknowing users, but would allow for efficient reversing / sublisting with full structural sharing and no wrappers.

The idea would be :

  • avoid the array copy in the ctor, hence the class being 100% unsafe if used if a reference to the array is kept
  • implement something akin to IntInterval, with final fields from/to, and methods for reverse/subList
  • keep that range as a private final field, to handle both reversing and sublisting as O(1) in time/memory, avoiding any unnecessary wrappers and thus indirections
  • implement iterations with direct loops over the underlying array, using a local from/to(similar to how IntInteral handles interations internally)
  • reverse iteration would only cost one if over getStep(), which the JIT should be able to optimize easily
  • that solution could in turn be used for an hypothetic NavigableMap implementation for ImmutableTreeMap

One thing I'd like to explore is fuller test coverage by moving tests into the unit-tests-java8 hierarchy. I'm going to give it a shot by asking Claude to implement sorted equivalents of unit-tests-java8/src/test/java/org/eclipse/collections/test/set/SetTestCase.java and unit-tests-java8/src/test/java/org/eclipse/collections/test/set/mutable/HashSetTest.java.

Yes, I forgot some things (floor and such on the unmodifiable variant), and test coverage should be vastly improved. Especially, It would require a big refactoring of AbstractSortedSetTestCase to check that views (descendingSet, headSet, tailSet, subSet) are 100% correct... which is not the case with the current implementation as I said earlier.

I'll try to implement a cleaner version with that idea to cover all these pitfalls if you agree it might be a more sensible approach.

ps: An illustration of what I mean :

/*
 * class is unsafe, but could be used internally for :
 * - ImmutableTreeSet/ImmutableTreeMap
 * - places where we're 100% sure to have a safe array and want to return an ImmutableList
 */
public final class ImmutableInPlaceArrayAdapter<T> extends AbstractImmutableList<T>
{
    private final T[] array;
    private final int from;
    private final int to;

    // no copy, like FastList 
    public ImmutableInPlaceArrayAdapter<T>(T[] array)
    {
        // throw on empty array -> should use ImmutableEmptyList instead
        this(array, 0, array.length - 1);
    }

    private ImmutableInPlaceArrayAdapter<T>(T[] array, int from, int to)
    {
        this.array = array;
        this.from = from;
        this.to = to;
    }

    private int physicalIndex(int logicalIndex)
    {
        if (from <= to)
        {
            return to - logicalIndex;
        }
        else
        {
            return fromIndex + localIndex;
        }
    }

    @Override
    public int size()
    {
        return to - from;
    }

    // constant and inplace
    @Override
    public ImmutableInPlaceArrayAdapter toReversed()
    {
        return new ImmutableInPlaceArrayAdapter<>(this.array, this.to, this.from);
    }

    // constant and inplace
    @Override
    public ImmutableInPlaceArrayAdapter subList(int fromInclusive int toExclusive)
    {
        // range check
        return new ImmutableInPlaceArrayAdapter<>(this.array, physicalIndex(fromInclusive), physicalIndex(toExclusive) - 1);
    }

    @Override
    public void each(Procedure proc)
    {
        // JIT should optimize the condition after the first test
        // if not, replace it with an explicit final boolean field
        if (from <= to)
        {
            // direct iteration over the array => better performance
            for (int i = from; i <= to; ++i)
            {
                proc.apply(array[i]);
            }
        }
        else
        {
            for (int i = to; i <= from; --i)
            {
                proc.apply(array[i]);
            }
        }
    }

    /* ... */

    /*
     * iterator / listIterator
     */
}

I know that this class would bring additional complexity and maintenance costs over the existing ImmutableArrayList, which does the same thing but with some slight differences :

  • safe to use as it copies the input array
  • toReversed isn't in place

So I wouldn't want to change that class and risk breaking existing behaviors.

All of this could be somewhat circumvented by making this new class abstract, thus forcing end-users to inherit from it to use it. That way, users can't complaint they didn't understand what they were doing : they had to very explicitly wish for that behavior.

tampix avatar Oct 11 '25 18:10 tampix

Honestly, I think this implementation is already pretty close to ready and doesn't warrant a rewrite. ImmutableTreeSet used to be backed by an array. Having it be backed by a list is fine too IMO. If it's an ImmutableArrayList, you know it's safe. If you take a subset of a subset, that would delegate to sublist of a sublist, which is already optimized to reduce down to a single view. Or if you want to keep an array for memory efficiency, you can delegate to ArrayAdapter for iteration patterns in many cases.

motlin avatar Oct 11 '25 20:10 motlin

Problem is : in theory it is.

In practice, as I use FastList to avoid copying the array, I'd have to override all methods where iteration is done internally with a loop => I can do it right now, but the coupling will make maintenance painful going forward.

I could've used ArrayAdapter instead, but in-place reverse would've been either impossible (thus breaking descendingSet contract) or painfully hacky.

I might be too obstinate or fearful here, but I seriously think my current approach must be scrapped, as it's buggy, and making it work would lead to a brittle solution.

I want to try to come up with another PR that focuses on correctness as an alternative, albeit more verbose.

I'll also try to make my current approach correct, but I'm not confident it will be at least good enough, sadly :(

Then, your call on which solution to pick :)

[edit] Existing wrappers comparisons :

class inplace ctor implace subList implace toReversed
ImmutableArrayList KO (defensive copy) OK (ImmutableSubList) KO (copy over asReversed)
ArrayAdapter OK OK? (ListAdapter over RandomAccessSubList) KO (clone)

Comparing to TreeSet, which delegate to TreeMap, where they manage to keep everything in place with at most 1 indirection. They achieves this with a very complex implementation with a combination of TreeMap, DescendingSubMap and AscendingSubMap. Doing so, they guarantee at most one wrapper, even when doing crazy things like new TreeSet<>(Arrays.asList(1, 2, 3)).descendingSet().subSet(2, 1).descendingSet().subSet(2, 2).

So what I'm trying to achieve with my new adapter is to replicate this specific behavior : inplace reverse/slicing and while we keep indirections to a minimum.

tampix avatar Oct 12 '25 01:10 tampix

I'm looking around at the adapters we have, and we're missing an adapter for a random access list that is itself a random access list. We have ReverseIterable, but that doesn't provide indexed access.

My thought about ImmutableInPlaceArrayAdapter is that it's pretty specific. A reverse list adapter would be useful anyway, so we might as well build it. And then it might be possible to build what we need with several wrappers. Collapsing them all into a single wrapper might be a micro optimization, but we also may not need to do it.

motlin avatar Oct 12 '25 13:10 motlin

My thought about ImmutableInPlaceArrayAdapter is that it's pretty specific.

I tried my hand at implementing it to show what that could look like : https://github.com/eclipse-collections/eclipse-collections/compare/master...tampix:eclipse-collections:navigableset2

I kept it very straightforward as my aim was to optimize for memory-efficiency, not raw performance (except when it makes sense). So most methods forward to RandomAccessListIterate.

As it's still in draft, I've not added specific unit-tests yet, just the inherited ones, but it does the job as a PoC.

A reverse list adapter would be useful anyway, so we might as well build it. And then it might be possible to build what we need with several wrappers. Collapsing them all into a single wrapper might be a micro optimization, but we also may not need to do it.

That might a better solution than the one I suggested. One unique list adapter optimized for both slicing and reversing would be more generic. I could see something like this being useful to implement things like chunk, sliding windows and sub. But it might take a little bit more work to implement than my quick and dirty solution :)

tampix avatar Oct 12 '25 15:10 tampix