
Tuple semigroupal operations on parsers result in the wider Parser0 type - workarounds?

Open netvl opened this issue 3 years ago • 2 comments

The original design post contains this quite reasonable statement:

It is interesting to note how Parser1 composes with Parser: if you sequence one of each, you get a Parser1 as a result since once characters are consumed they are never unconsumed.

This holds when you compose parsers with ~ (in the current naming, the quote's Parser1 is Parser and its Parser is Parser0):

val p1: Parser[A] = ...
val p2: Parser0[B] = ...

val combined1: Parser[(A, B)] = p1 ~ p2
val combined2: Parser[(B, A)] = p2.with1 ~ p1  // can use the with1 workaround if the left parser is Parser0
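
As a concrete illustration of the two lines above, here is a minimal sketch. The parsers `word` and `spaces` and the object name `SequencingExample` are made up for this example; only `Parser.charsWhile`, `Parser.charsWhile0`, `~`, `with1` and `parseAll` come from cats-parse.

```scala
import cats.parse.{Parser, Parser0}

object SequencingExample {
  // Hypothetical parsers, chosen only for illustration.
  val word: Parser[String] = Parser.charsWhile(_.isLetter)    // must consume at least one character
  val spaces: Parser0[String] = Parser.charsWhile0(_ == ' ')  // may succeed without consuming input

  // Parser ~ Parser0 keeps the narrower Parser type.
  val wordThenSpaces: Parser[(String, String)] = word ~ spaces

  // Parser0 on the left: go through with1 so the result is still a Parser.
  val spacesThenWord: Parser[(String, String)] = spaces.with1 ~ word
}
```

With these definitions, `SequencingExample.wordThenSpaces.parseAll("abc  ")` gives `Right(("abc", "  "))` and `SequencingExample.spacesThenWord.parseAll("  abc")` gives `Right(("  ", "abc"))`.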

However, if you use the tuple operations from cats.syntax.contravariantSemigroupal to combine several parsers into a single flat tuple (to avoid nested tuples and get somewhat nicer syntax), the narrower type is lost:

import cats.syntax.contravariantSemigroupal._

val p1: Parser[A] = ...
val p2: Parser0[B] = ...

// does not compile: p2 is a Parser0, so the instances for the "wider" Parser0 type are used,
// and `.tupled` therefore returns `Parser0[(A, B)]`
val combined1: Parser[(A, B)] = (p1, p2).tupled

Is there a workaround for this other than using ~? The problem is that several ~ operators in a row produce a value made of deeply nested tuples, which is awkward to destructure:

(a ~ b ~ c ~ d ~ e).map {
  case ((((x1, x2), x3), x4), x5) =>
    ...
}

// vs tupled/mapN, if it were possible:

(a, b, c, d, e).mapN {
  (x1, x2, x3, x4, x5) =>
    ...
}

netvl • Jun 03 '21 09:06

I wonder if something like fastparse's Sequencer mechanism would be a good addition. Based on the parsers' result types, it drops Unit values and collects the remaining values into a single flat tuple:

val u: Parser[Unit] = ...

(a ~~ u ~~ b ~~ u ~~ c ~~ u ~~ d).map {  // assuming a, b, c, d are Parser[<not Unit>]
  case (aValue, bValue, cValue, dValue) => ...
}

I'm not sure whether it can be adapted to the Parser/Parser0 distinction, but it might be.

netvl • Jun 03 '21 09:06

I don't know of a great workaround.

We could certainly add something like what you suggest, although I think silently dropping Unit values can be pretty confusing when you have a block of items, since you can't easily see which ones are Unit-valued.

I think this is just an area where the semigroupal abstraction falls over.

You could do something hacky, such as splitting off the first Parser and then using p1 ~ (p2, p3, p4).tupled. It would be nice to have a good design for this.
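
A minimal sketch of that head-plus-tail idea, assuming the tail parsers are all typed as Parser0. The parser names and the object name `HeadTailWorkaround` are hypothetical. `cats.syntax.apply._` is used here for `.tupled`; the `contravariantSemigroupal` import from the question should provide the same tuple syntax.

```scala
import cats.parse.{Parser, Parser0}
import cats.syntax.apply._ // brings .tupled / .mapN for tuples of parsers

object HeadTailWorkaround {
  // Hypothetical parsers, chosen only for illustration.
  val name: Parser[String] = Parser.charsWhile(_.isLetter)     // the head: a real Parser
  val spaces: Parser0[String] = Parser.charsWhile0(_ == ' ')
  val digits: Parser0[String] = Parser.charsWhile0(_.isDigit)
  val bangs: Parser0[String] = Parser.charsWhile0(_ == '!')

  // The tail is combined with the Parser0 instances, so it is a Parser0.
  val tail: Parser0[(String, String, String)] = (spaces, digits, bangs).tupled

  // Parser ~ Parser0 is a Parser again; only one level of nesting is left to flatten.
  val combined: Parser[(String, String, String, String)] =
    (name ~ tail).map { case (n, (s, d, b)) => (n, s, d, b) }
}
```

For example, `HeadTailWorkaround.combined.parseAll("abc 12!!")` gives `Right(("abc", " ", "12", "!!"))`. The cost is one manual flattening step in `map`, which is why this is a stopgap and not a proper design.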

johnynek • Jun 03 '21 20:06