vavr icon indicating copy to clipboard operation
vavr copied to clipboard

Add tap() and/or tapEach() methods

Open Abnaxos opened this issue 3 years ago • 5 comments

Collections, Options etc. should have a tap(Consumer<? super T>) method.

This is related to #2424, where I criticised the inconsistent behaviour of peek(). I still do, BTW.

Problem at hand: there has to be a way to call a function on each element of a collection/option/etc solely for its side-effect (e.g. logging). peek() isn't the right method for this, because depending on the underlying implementation, it may only be applied to the first element of the collection. AFAIK, you can use mySeq.toSteam().peek(e -> ...) to make sure that the function will be applied to each element, but I consider this dangerous (just forget toStream() and you don't know, how exactly peek() will behave, it may not do what you wanted it to do – this can be hard to spot).

Scala 2.13 introduced tap() and tapEach() for this purpose. Add this to Vavr as well.

Abnaxos avatar Jun 15 '21 07:06 Abnaxos

let's do it! It is great to sync with Scala. Could you clarify the difference between/define the semantics of peek/tap, forEach/tapEach?

danieldietrich avatar Jun 15 '21 13:06 danieldietrich

OK, first of all, I just saw that toStream().peek() won't do what I'd expect. It always returns only the first element, again and again.

tap() is for single values, therefore probably not interesting for Vavr. Now let's look at tapEach. I'll first establish the workaround I'm currently using to get the behaviour I want:

public final class VavrX {
  public static <T> Function<T, T> touch(Consumer<? super T> consumer) {
    return v -> {
      consumer.accept(v);
      return v;
    };
  }
}

So, touch() takes a consumer and returns a function that calls that consumer's accept() with the input value, then returns the input value unchanged.

Let's look at this small program:

public class Tap {

  public static void main(String[] args) {
    var src = List.of(1, 2, 3);
    System.out.println("peek()");
    src.peek(i -> System.out.println("in: " + 1))
        .map(Tap::dup)
        .forEach(i -> System.out.println("out: " + i));
    System.out.println("toStream().peek()");
    src.toStream()
        .peek(i -> System.out.println("in: " + 1))
        .map(Tap::dup)
        .forEach(i -> System.out.println("out: " + i));
    System.out.println("map(touch())");
    src.map(touch(i -> System.out.println("in: " + i)))
        .map(Tap::dup)
        .forEach(i -> System.out.println("out: " + i));
  }

  public static int dup(int i) {
    return i * 2;
  }
}

The output of this is:

peek()
in: 1
out: 2
out: 4
out: 6
toStream().peek()
in: 1
out: 2
in: 1
out: 4
in: 1
out: 6
map(touch())
in: 1
in: 2
in: 3
out: 2
out: 4
out: 6

The map(touch()) workaround is the behaviour I want. It's also the behaviour of Java's Stream::peek. Since 2.13, Scala has it as well, it's called tapEach(). tapEach() is like forEach(), but it's not a terminal operation. From Scala's documentation:

def tapEach[U](f: (A) => U): Iterable[A] Applies a side-effecting function to each element in this collection. Strict collections will apply f to their elements immediately, while lazy collections like Views and LazyLists will only apply f on each element if and when that element is evaluated, and each time that element is evaluated.

So, the last variant with map(touch()) could now be written as:

System.out.println("tapEach()");
src.tapEach(i -> System.out.println("in: " + i))
    .map(Tap::dup)
    .forEach(i -> System.out.println("out: " + i));

Edit: For the sake of completeness, here's the toStream().map(touch()) variant:

System.out.println("toStream().map(touch())");
src.toStream()
    .map(touch(i -> System.out.println("in: " + i)))
    .map(Tap::dup)
    .forEach(i -> System.out.println("out: " + i));

Output:

toStream().map(touch())
in: 1
out: 2
in: 2
out: 4
in: 3
out: 6

Again, map(touch()) behaves exactly like tapEach() would, but without the overhead of a map() that doesn't change the value.

Abnaxos avatar Jun 15 '21 16:06 Abnaxos

@sleepytomcat this would be a good issue:

interface Traversable<T> {

    // javadoc
    Traversable<T> tapEach(Consumer<? super T> action);

}

It needs to be implemented on each collection

interface Seq<T> extends Traversable<T> {

    @Override
    Seq<T> tapEach(Consumer<? super T> action);

}

interface List<T> extends LinearSeq<T> {

    @Override
    default List<T> tapEach(Consumer<? super T> action) {
        return Collections.tapEach(this, action);
    }

}

// existing utility class
final class Collections {

    static <T, C extends Traversable<T>> C tapEach(C source, Consumer<? super T> action) {
        return source.map(t -> {
            action.accept(t);
            return t;
        });
    }

}

Regarding the tests, I would love to see general AbstractTraversable tests, also for the expected order of side-effects. Maybe we need additional tests for Stream and Iterator (the lazily evaluated structures) because they behave slightly different but I'm not sure...

I would use a javadoc for tapEach along the lines of the Scala one Abnaxos mentioned.

What do you think?

danieldietrich avatar Jul 05 '21 13:07 danieldietrich

let's do it! It is great to sync with Scala. Could you clarify the difference between/define the semantics of peek/tap, forEach/tapEach?

when can we use the v1.0.0? Need for this feature badly~

kgyhkgyh avatar Aug 12 '21 13:08 kgyhkgyh

@danieldietrich @Abnaxos I'm addressing this in https://github.com/vavr-io/vavr/issues/2676, need some clarification on how we actually want to connect Scala and Vavr logic here.

  1. From Scala 2.13 documentation:

def tapEach[U](f: (A) => U): Stream[A] Applies a side-effecting function to each element in this collection. Strict collections will apply f to their elements immediately, while lazy collections like Views and LazyLists will only apply f on each element if and when that element is evaluated, and each time that element is evaluated.

https://www.scala-lang.org/api/2.13.6/scala/collection/immutable/Stream.html#tapEachU:C

Specifically this part: "lazy collections [...] will only apply f on each element if and when that element is evaluated, and each time that element is evaluated".

  1. In Vavr, the rules of if and when that element is evaluated seem to be different from Scala, i.e. io.vavr.collection.Stream does have lazy evaluation for tail only, but the head is evaluated immediately. For example, let's consider the following code:
    vavrStream = List.of(1,2,3).toStream();
    vavrStream.map(x -> {System.out.println("element " + x + " mapped") ; return x + 1;});

— here the output would be:

element 1 mapped

My point is, directly applying Scala logic from (1) to Vavr design (2) would result in

    vavrStream = List.of(1,2,3).toStream();
    vavrStream.tapEach(x -> System.out.println("element " + x + " tapped"));

Output:

element 1 tapped

I believe this is not what we want.

All of the above makes me think we may need to clarify the rules (exception from the rules?) of materializing/evaluating elements of a Stream in Vavr to be able to implement tapEach() without design contradictions.

sleepytomcat avatar Aug 30 '21 04:08 sleepytomcat