jOOL icon indicating copy to clipboard operation
jOOL copied to clipboard

Inconsistencies in SeqUtils.transform's Spliterator.characteristics() and Spliterator.getComparator()

Open tlinkowski opened this issue 8 years ago • 7 comments

I think I found some inconsistencies in SeqUtils.transform concerning the method characteristics().

We have:

            @Override
            public int characteristics() {
                return delegate.characteristics() & Spliterator.ORDERED;
            }
            
            @Override
            @SuppressWarnings("unchecked")
            public Comparator<? super U> getComparator() {
                
                // This implementation works with the JDK 8, as the information
                // is really only used in 
                // java.util.stream.StreamOpFlag.fromCharacteristics(Spliterator<?> spliterator)
                // Currently, the point of this method is only to be used for
                // optimisations (e.g. to avoid sorting a stream twice in a row)
                return (Comparator) delegate.getComparator();
            }

But since in calculating the characteristics above we use bitwise AND (&) with Spliterator.ORDERED, ths Spliterator will never report Spliterator.SORTED, and so the method getComparator will never be called in StreamOpFlag.fromCharacteristics:

(characteristics & Spliterator.SORTED) != 0 && spliterator.getComparator() != null

I think characteristics() should return something like this:

            public int characteristics() {
                return (delegate.characteristics() | Spliterator.ORDERED) & ~(Spliterator.SIZED | Spliterator.SUBSIZED);
            }

which would mean "we preserve the original characteristics, adding ORDERED if absent, and removing SIZED and SUBSIZED if present".

tlinkowski avatar May 01 '17 08:05 tlinkowski

Interesting observation. Indeed, perhaps the existing operation is too agressive. Would you mind explaining the rationale behind your version (the part about SIZED and SUBSIZED)? Are none of the other characteristics affected, in your opinion?

lukaseder avatar May 01 '17 17:05 lukaseder

Well, as far as I understand, SeqUtils.transform always returns a subset of elements from the source (never adding any elements) so:

  • it cannot be SIZED nor SUBSIZED (some element may have been removed)
  • it is still DISTINCT, NONNULL, and IMMUTABLE (nothing added)
  • it is still SORTED (nothing reordered)
  • I don't know about CONCURRENT - I guess it's safer to remove it too
  • as for ORDERED, I assumed we want to impose it because the docs say that Seq is seqential and ordered

tlinkowski avatar May 01 '17 18:05 tlinkowski

Thanks for taking the time to explain this (and sorry again for my laziness here. jOOQ has been keeping me rather busy, recently). You made good points. If you're interested, I think this looks good for a PR. I think we could put your explanation there as a comment, too with a reference to this issue, so we'll still remember this in the future.

lukaseder avatar May 08 '17 10:05 lukaseder

Sure, I'll provide a PR.

PS. As to response time, I'm absolutely fine with responses in the spirit of "eventual consistency" ;)

tlinkowski avatar May 08 '17 11:05 tlinkowski

Yeah. "Eventually", my inbox with notifications reaches zero, right? :)

lukaseder avatar May 08 '17 14:05 lukaseder

Heh, you're right - the analogy is flawed :)

tlinkowski avatar May 09 '17 07:05 tlinkowski

Hm, I took a much closer look and I found the following:

  • in Seq.onEmptyGet we cannot preserve NONNULL because the Supplier may return null
  • in Seq.cycle we cannot preserve DISTINCT nor SORTED (but I'd like to change its implementation to one using SeqBuffer so it wouldn't matter)
  • in Seq.shuffle we cannot preserve SORTED (but I've already proposed a new implementation using SeqUtils.lazy so it wouldn't matter either)
  • in Seq.scanLeft we have a mapper function so we cannot preserve DISTINCT, NONNULL, nor SORTED but it seems this implementation can be easily simplified to the following:
return Seq.of(seed).concat(stream.map(t -> value[0] = function.apply(value[0], t)));

(which seems safe unless someone calls stream().parallel() - to alleviate that we'd need some DelegatingSequentialSpliterator class that would delegate all methods except trySplit and that would not report CONCURRENT)

tlinkowski avatar May 09 '17 08:05 tlinkowski