vavr icon indicating copy to clipboard operation
vavr copied to clipboard

Let Traversable.grouped()/sliding() return GroupedIterator instead of Iterator

Open danieldietrich opened this issue 8 years ago • 2 comments

Scala's GroupedIterator has the ability of padding and skipping partial results (both not implemented in our version, yet).

It would be nice if we could provide the same functionality. Returning a GroupedIterator will not be backward compatible.

Here is an example implementation similar to that of Scala:

final class GroupedIterator<T> extends AbstractIterator<Seq<T>> {

    private final Iterator<T> self;
    private final int size;
    private final int step;
    private final int gap;


    private Seq<T> buffer = Vector();                       // the buffer
    private boolean filled = false;                         // whether the buffer is "hot"
    private boolean partial = true;                         // whether we deliver short sequences
    private Option<Supplier<T>> pad = None();               // what to pad short sequences with

    GroupedIterator(Iterator<T> self, int size, int step) {
        if (size < 1 || step < 1) {
            throw new IllegalArgumentException("size (" + size + ") and step (" + step + ") must both be positive");
        }
        this.self = self;
        this.size = size;
        this.step = step;
        this.gap = Math.max(step - size, 0);
    }

    // -- builders

    @SuppressWarnings("unchecked")
    public GroupedIterator<T> withPadding(Supplier<? extends T> paddingSupplier) {
        pad = Some((Supplier<T>) paddingSupplier);
        return this;
    }

    public GroupedIterator<T> withPartial(boolean partial) {
        this.partial = partial;
        if (partial) {
            // reset pad since otherwise it will take precedence
            pad = None();
        }
        return this;
    }

    // -- Iterator impl

    @Override
    public boolean hasNext() {
        return filled || fill();
    }

    @Override
    protected Seq<T> getNext() {
        filled = false;
        return buffer;
    }

    // -- helpers

    // fill() returns false if no more sequences can be produced
    private boolean fill() {
        if (!self.hasNext()) {
            return false;
        } else {
            // the first time we grab size, but after that we grab step
            final int count = buffer.isEmpty() ? size : step;
            return go(count);
        }
    }

    private boolean go(int count) {

        final int prevSize = buffer.size();
        final boolean isFirst = prevSize == 0;
        final Seq<T> xs; {
            // If there is padding defined we insert it immediately so the rest of the code can be oblivious
            Seq<T> res = takeDestructively(count);
            // but since we took the group eagerly, just use the fast length
            final int shortBy = count - res.size();
            if (shortBy > 0 && pad.isDefined()) {
                final Iterable<T> padding = Iterator.fill(shortBy, pad.get());
                res = res.appendAll(padding);
            }
            xs = res;
        }

        final int len = xs.size();
        final boolean incomplete = len < count;

        // if 0 elements are requested, or if the number of newly obtained
        // elements is less than the gap between sequences, we are done.
        final Function<Integer, Boolean> deliver = howMany -> {
            if (howMany > 0 && (isFirst || len > gap)) {
                if (!isFirst) {
                    buffer = buffer.drop(Math.min(step, prevSize));
                }
                final int available = isFirst ? len : Math.min(howMany, len - gap);
                buffer = buffer.appendAll(xs.takeRight(available));
                filled = true;
                return true;
            } else {
                return false;
            }
        };

        if (xs.isEmpty()) {
            // self ran out of elements
            return false;
        } else if (partial) {
            // if _partial is true, we deliver regardless
            return deliver.apply(Math.min(len, size));
        } else if (incomplete) {
            // !_partial && incomplete means no more seqs
            return false;
        } else if (isFirst) {
            // first element
            return deliver.apply(len);
        } else {
            // the typical case
            return deliver.apply(Math.min(step, size));
        }
    }

    private Seq<T> takeDestructively(int size) {
        final java.util.List<T> buf = new ArrayList<>();
        // We check i < size before self.hasNext() because self.hasNext() could be blocking
        for (int i = 0; i < size && self.hasNext(); i++) {
            buf.add(self.next());
        }
        return Vector.ofAll(buf);
    }
}

We need to fuse that with our existing impl. Maybe we can simplify things a bit.

danieldietrich avatar Mar 10 '17 16:03 danieldietrich

I will be able to solve this issue without introducing a backw'incompat change.

W.l.o.g., let's take a look at Iterator.grouped(int)`. It returns an Iterator containing Seq's (currently: Stream):

    @Override
    default Iterator<Seq<T>> grouped(int size) {
        return new GroupedIterator<>(this, size, size);
    }

The collections have:

// Traversable
Iterator<? extends Traversable<T>> grouped(int size);

// Seq
@Override
Iterator<? extends Seq<T>> grouped(int size);

// IndexedSeq
@Override
Iterator<? extends IndexedSeq<T>> grouped(int size);

// Set
@Override
Iterator<? extends Set<T>> grouped(int size);

// and so on...

Proposed solution:

  1. Add <U extends Iterable> Iterable<U> Iterator.grouped(int, Function<? super Iterable, ? extends U>)
  2. Let Traversable.grouped(int) return this.iterator().grouped(int, this::ofAll)

Note: This would be also one step towards #1945

danieldietrich avatar Jul 05 '19 11:07 danieldietrich

Another topic: grouped, sliding (and other methods) might need to be moved from Traversable to Seq because they make only sense if elements have a specific order...

danieldietrich avatar Jul 05 '19 11:07 danieldietrich