cats icon indicating copy to clipboard operation
cats copied to clipboard

Added `flatParMapN`

Open TonioGela opened this issue 3 years ago • 13 comments

Hello 👋!

Taking inspiration from https://github.com/typelevel/cats/pull/4009 I added flatParMapN generation in Boilerplate. Let me know if it works for you or if something needs to be changed.

TonioGela avatar Jun 17 '22 17:06 TonioGela

To me it seems that the name should be parFlatMapX rather than flatParMapX...

satorg avatar Jun 17 '22 18:06 satorg

To me it seems that the name should be parFlatMapX rather than flatParMapX...

After a discussion with @armanbilge, I decided that parFlatMapN could have been confusing since it's not flatMap related but it's a mere mapN(...).flatten. It's equivalent to a flatMap just in the 1-arity case.

TonioGela avatar Jun 17 '22 18:06 TonioGela

Right, I initially proposed parFlatMapN but I realized it doesn't make sense.

The whole concept of Parallel is about a relationship between a Monad and its "parallel" Applicative (FlatMap and its parallel Apply). That is to say, there is no "parallel" flatMap and thus a parFlatMap is non-sensical IMO.

However, if you imagine flatMap is named for "flattening a map", then in this case we are "flattening a parMap".

armanbilge avatar Jun 17 '22 18:06 armanbilge

On the second thought... If flatParMapN is just parMapN followed by flatten, what is the advantage we can get by adding it? I mean if we could optimize it somehow even a little bit by avoiding creating interim object... But seems that we just cannot do anything like that and the proposed implementation does exactly the same thing as parMap + flatten.

In other words, having

def foo: F[Foo]
def bar: F[Bar]
def woo: F[Woo]
def baz: F[Baz]

def combineThemAll: (Foo, Bar, Woo, Baz) => F[Result]

then instead of

(foo, bar, woo, baz).parMap(combineThemAll).flatten

we can get

(foo, bar, woo, baz).flatParMap(combineThemAll)

But it comes at price of adding 22+22 new methods.

satorg avatar Jun 18 '22 01:06 satorg

On the second thought... If flatParMapN is just parMapN followed by flatten, what is the advantage we can get by adding it?

I see your point, but how is this any different than https://github.com/typelevel/cats/pull/4009?

armanbilge avatar Jun 18 '22 02:06 armanbilge

The difference is not that huge though, but:

  • #4009 introduces FlatMapArityFunctions which is mixed into FlatMap which is a typeclass and thus can be potentially optimized for some implementations.
  • in contrast to that this PR adds new methods to:
    • ParallelArityFunctions which is mixed into object Parallel as well as
    • final class TupleNParallelOps which are, well, final classes.

Therefore there no way the flatParMap stuff can be optimized in any case.

That said, I am not strictly against this addition... Just to double check that we understand all the pros and cons we're going to get.

satorg avatar Jun 18 '22 02:06 satorg

  • ParallelArityFunctions which is mixed into object Parallel as well as

Ah yes, that's a good point. I only just realized this myself but the implication didn't register.

I was curious why they are defined on the object instead of the trait itself.

armanbilge avatar Jun 18 '22 02:06 armanbilge

I was curious why they are defined on the object instead of the trait itself.

I am curious too, but I can imagine that one of the reasons might have been to have a consistent behaviour between the parMap2..22 functions. Having ParallelArityFunctions mixed with the trait would have let parMap2 and parMap3 have completely different implementations (within the limits of the signature of course).

That said, this might be a reason to not want a FlatParMapArityFunctions trait to be mixed with Parallel, but I'm not sure I'm able to foresee all the implications.

What do you think @satorg?

TonioGela avatar Jun 18 '22 06:06 TonioGela

To bikeshed this a bit, I think parFlatMap might be a better name: the par part of it happens, then we flatMap.

That or maybe parMapFlatten, which at least reads like what is happening: first we parMap then we flatten.

johnynek avatar Jun 18 '22 07:06 johnynek

That or maybe parMapFlatten, which at least reads like what is happening: first we parMap then we flatten.

To keep up with the bikeshedding: I love parMapFlatten in the 1-arity case but both parMapFlattenN and parMapNFlatten (or parMapFlatten2, parMap2Flatten and siblings) are quite unreadable IMO.

TonioGela avatar Jun 18 '22 07:06 TonioGela

I would also like to add one more point on par*-like name. To me it looks like an established convention – if a method name starts with parFooBar then we could assume that it tries to do the same thing as just fooBar but via the Parallel (or NonEmptyParallel) instance. So it was pretty clear and straightforward assumption: map <-> parMap, tupled <-> parTupled, traverse <-> parTraverse and so it.

Now the proposed name flatParMapN breaks this convention and thus it makes its discoverability quite impeded. So I would still vote for making it parFlatMapN because semantically it is the same as flatMapN but just via Parallel.

I bet that most users who want to turn their parMapN(...).tupled expression into a single call will be looking for parFlatMapN name first because they could be aware that the same transition from mapN(...).tupled will be flatMapN.

satorg avatar Jun 21 '22 18:06 satorg

So I would still vote for making it parFlatMapN because semantically it is the same as flatMapN but just via Parallel.

This is the part that I really struggle with. As we know Parallel is a relationship between a Monad and an Applicative.

My intuition for a "par" method is that it delegates to the "parallel" Applicative implementation of that method. For example:

  • parMapN <-> parallel.mapN
  • parProduct <-> parallel.product
  • parAp <-> parallel.ap
  • parReplicateA <-> parallel.replicateA
  • ...
  • parFlatMap <-> parallel.flatMap 🤯 no such thing!

armanbilge avatar Jun 21 '22 19:06 armanbilge

Both arguments are valid ones

I bet that most users who want to turn their parMapN(...).tupled expression into a single call will be looking for parFlatMapN name first because they could be aware that the same transition from mapN(...).tupled will be flatMapN.

I would love thea coherence you're describing here, but in the first place I have to admit that once I read in the changelog about the addition of flatMapN I needed a minute or 2 to realize that it was a mere shortcut for mapN(...).flatten.

At first, I began searching for the implementation to check whether it was implemented like (M[A], M[B]) => M[A].flatMap(a => M[B].flatMap(...)) (so sequentially) and it was there that I had the idea to add a parMapN(...).flatten alias.

From a certain perspective, this episode might demonstrate that having flatMapN and parFlatMapN side by side would be less confusing, in particular given the fact that a method requires NonEmptyParallel[M] while the other a FlatMap[F].

parFlatMap <-> parallel.flatMap 🤯 no such thing!

The problem IMHO arises here, in the 1-arity case. I suppose that the first impression of a lot of people in front of a parFlatMap would be a "WTH? I just understood that monads are a great tool to encode sequential computation, how can a parallel flatMap exist?"

That said, I would like to gather further opinions, we are just 4 people discussing it here and maybe someone else might have an enlightening idea on the right name to pick. What do you think?

TonioGela avatar Jun 22 '22 07:06 TonioGela

👋 hi, person-who-raised-the-issue-against-mouse here, I never even considered googling for flatParMap - it's down to taste I guess, but to me parFlatMapN makes most sense (also, if IDE support becomes good, I can go .par... and my IDE can suggest all the parallel methods which would be nice)

TobiasRoland avatar Sep 13 '22 11:09 TobiasRoland

Thanks both for chiming in. It seems everyone likes parFlatMapN ... but I still disagree. Particularly about it being a matter of taste, rather it is simply wrong and not following Cats conventions. There is no Parallel version of .flatMap.

Can anyone at least respond specifically to my point in https://github.com/typelevel/cats/pull/4243#issuecomment-1162201748? It's okay if the answer is "I don't care because I want IDE completion" 😂

armanbilge avatar Sep 13 '22 11:09 armanbilge

Can anyone at least respond specifically to my point in https://github.com/typelevel/cats/pull/4243#issuecomment-1162201748?

Alright, you asked for it! 😆 I think you described rather implementation details than the actual naming convention/contract of API:

  • parMapN <-> parallel.mapN
  • parProduct <-> parallel.product
  • parAp <-> parallel.ap
  • parReplicateA <-> parallel.replicateA
  • ...
  • parFlatMap <-> parallel.flatMap 🤯 no such thing!

What's more important for user – one will get the parallel (concurrent actually) behavior of execution multiple effects using parFlatMapN <-> parMapN(...).flatten. Prefix par here is for highlighting that behavior, as for me. It's also pretty obvious – executing the underlying effect will go after executing the batch of effects. It comes from the contract straightforwardly. And I agree with @TobiasRoland that better support of IDE is a killer feature.

danicheg avatar Sep 13 '22 12:09 danicheg

I think you described rather implementation details than the actual naming convention/contract of API:

Ha, okay, fair point. I guess it's just a coincidence that all the existing implementation follow this pattern so far 😉 I do think it would be a good convention to have.

Personally I also think that:

map(...).flatten <-> flatMap(...)

screams that:

parMapN(...).flatten <-> flatParMapN(...)

Because this seemed like a convention too. But maybe it's just an implementation detail all along :)


It's also pretty obvious – executing the underlying effect will go after executing the batch of effects. It comes from the contract straightforwardly.

I thought about this for a while and I agree. I think it's pretty easy to see what parFlatMapN does based on an understanding of par and its method signature. And it is certainly better for IDE completion, but I'm not convinced this is a good argument to use for Cats.

I would prefer we establish a convention here based on the existing implementations and merge this PR as-is, but I won't stand in the way.

Thanks for the response :)

armanbilge avatar Sep 13 '22 15:09 armanbilge

Okay, I'll add one link to strengthen my hand in this negotiation (this trick is also known as argumentum ad verecundiam).

parFlatMap :: Strategy [b] -> (a -> [b]) -> [a] -> [b] Uses parMap to apply a list-valued function to each element of a list in parallel, and concatenates the results.

https://hackage.haskell.org/package/parallel-1.1.0.0/docs/Control-Parallel-Strategies.html#v:parFlatMap

danicheg avatar Sep 13 '22 16:09 danicheg

LOL 🏳️

armanbilge avatar Sep 13 '22 16:09 armanbilge

So as a bottomline.. Did I get it correctly, that we all agreed on the parFlatMapN naming?

satorg avatar Sep 13 '22 19:09 satorg

I wouldn't say "agree" but I am satisfied to blame haskell for this quirk 😝

armanbilge avatar Sep 13 '22 19:09 armanbilge

I wouldn't say "agree" but I am satisfied to blame haskell for this quirk 😝

Okay, I can modify the name to finally merge the PR so? :D

[EDIT] I'm okay to blame Haskell for that name 😇

TonioGela avatar Sep 17 '22 19:09 TonioGela

Looks good to me, thanks. I am personally ok to have it merged. Just wondering – should we await a little bit to hear from other guys who participated in the discussion? @danicheg @johnynek – any objections/concerns?

I re-read the whole discussion and it really seem that both @danicheg and @johnynek are great supporters of this new naming 😄

TonioGela avatar Sep 17 '22 20:09 TonioGela

made a comment, take or leave it.

Answered in the single discussions, thanks for the review

TonioGela avatar Oct 07 '22 07:10 TonioGela

Thank you for your efforts in putting this to the finish. Great work!

image

TonioGela avatar Oct 07 '22 07:10 TonioGela