cats-parse
Tuple semigroupal operations on parsers result in the wider Parser0 type - workarounds?
The original design post contains this quite reasonable statement:
It is interesting to note how Parser1 composes with Parser: if you sequence one of each, you get a Parser1 as a result since once characters are consumed they are never unconsumed.
This is true when you use ~ to compose parsers:
val p1: Parser[A] = ...
val p2: Parser0[B] = ...
val combined1: Parser[(A, B)] = p1 ~ p2
val combined2: Parser[(B, A)] = p2.with1 ~ p1 // can use the with1 workaround if the left parser is Parser0
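For a concrete version of the snippet above, here is a minimal sketch assuming cats-parse is on the classpath. The parsers `word` and `ws0` and the object name are hypothetical, chosen only for illustration; the result types show that ~ keeps the narrower Parser type in both orders.

```scala
import cats.parse.{Parser, Parser0}

object TildeKeepsParser {
  // Hypothetical example parsers, not taken from the original post.
  // `word` must consume at least one letter, so it is a Parser.
  val word: Parser[String] = Parser.charsWhile(_.isLetter)
  // `ws0` may match the empty string, so it is only a Parser0.
  val ws0: Parser0[Unit] = Parser.charsWhile0(_.isWhitespace).void

  // Parser ~ Parser0 stays a Parser.
  val wordThenWs: Parser[(String, Unit)] = word ~ ws0
  // Parser0 on the left needs with1 to get a Parser back.
  val wsThenWord: Parser[(Unit, String)] = ws0.with1 ~ word
}
```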
However, if you use the cats.syntax.contravariantSemigroupal operations to combine multiple parsers into a single tuple, to avoid working with nested tuples and for somewhat better syntax, this fails to work:
import cats.syntax.contravariantSemigroupal._
val p1: Parser[A] = ...
val p2: Parser0[B] = ...
// fails because p2 is Parser0, necessitating the use of Parser0 instances as they are "wider" in a sense,
// therefore `.tupled` returns `Parser0[(A, B)]`
val combined1: Parser[(A, B)] = (p1, p2).tupled
Is there a workaround for this, other than using ~? The thing is, multiple ~s in a row result in the parser value being a lot of nested tuples, which doesn't look very nice:
(a ~ b ~ c ~ d ~ e).map {
case ((((x1, x2), x3), x4), x5) =>
...
}
// vs tupled/mapN, if it was possible:
(a, b, c, d, e).mapN {
(x1, x2, x3, x4, x5) =>
...
}
I wonder if something like fastparse's Sequencer mechanism would be a good addition. Based on the type of parsers, it collects non-Unit values into a single tuple:
val u: Parser[Unit] = ...
(a ~~ u ~~ b ~~ u ~~ c ~~ u ~~ d).map { // assuming a, b, c, d are Parser[<not Unit>]
case (aValue, bValue, cValue, dValue) => ...
}
Not sure if it is adaptable to the Parser/Parser0 distinction, but it might be.
I don't know of a great workaround.
We could certainly have something like you suggest. Although, I think the wiping of unit values can be pretty confusing when you have a block of items (since you can't easily see which are unit valued).
I think this is just an area where the semigroupal abstraction falls over.
You could do something hacky like chop off the first Parser, then use p1 ~ (p2, p3, p4).tupled or something... it would be nice to have a good design.
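The "chop off the first Parser" idea could be sketched roughly as follows. This is only an illustration under assumptions: the parsers `letter`, `digits0`, `spaces0`, and `rest0` are hypothetical, and the final map that flattens the one remaining level of nesting is my addition, not something proposed in the thread.

```scala
import cats.parse.{Parser, Parser0}
import cats.syntax.all._

object TupledWorkaround {
  // Hypothetical example parsers, for illustration only.
  val letter: Parser[Char] = Parser.charIn('a' to 'z')
  val digits0: Parser0[String] = Parser.charsWhile0(_.isDigit)
  val spaces0: Parser0[Unit] = Parser.charsWhile0(_ == ' ').void
  val rest0: Parser0[String] = Parser.charsWhile0(_.isLetter)

  // The tail is tupled through the Parser0 instances, giving a Parser0.
  val tail: Parser0[(String, Unit, String)] = (digits0, spaces0, rest0).tupled

  // Sequencing the leading Parser with ~ keeps the result a Parser;
  // only one level of nesting is left to unpack.
  val combined: Parser[(Char, String, String)] =
    (letter ~ tail).map { case (c, (ds, _, r)) => (c, ds, r) }
}
```

This only helps when the first parser in the sequence is a Parser; if the first consuming parser sits later in the sequence, something like with1 would still be needed.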