nimble_parsec icon indicating copy to clipboard operation
nimble_parsec copied to clipboard

Add `expect` combinator

Open tmbb opened this issue 6 years ago • 6 comments

As discussed here, NimbleParsec should have an expect combinator for better error reporting. I suggest something like this: expect(previous \\ [], expected). This combinator would raise an error is expected didn't match.

My use case: I'm currently working on a parser for the ICU Message Format. Currently, NimbleParsec backtracks so much that I get rather useless error messages. Suppose we have something like this:

{variable, plural, one {...} two {...} abcd {...}}

The above is invalid, because abcd is not a supported option for the plural argument type. I already know it won't be supported as soon as I parse {variable, plural, , and I'd like to emit an error as soon as I find the abcd option. But currently there is no way to do this... NimbleParsec will just backtrack and fail with an error message that doesn't really help anyone (it actually says it expects an end of string on character 0...).

An expect combinator would allow me to emit the correct error message.

tmbb avatar Jan 21 '19 23:01 tmbb

So one option is to use label to wrap the whole construct and you should get better error messages.

If that doesn’t work, I believe expect can be implemented with optional(lookahead_not(combinator) |> post_traverse({mfa_that_returns_error_reason}).

My suggestion would be to add an error(reason) combinator, which should be quite straightforward. A PR is appreciated!

José Valim www.plataformatec.com.br Skype: jv.ptec Founder and Director of R&D

josevalim avatar Jan 21 '19 23:01 josevalim

To clarify, the error combinator is just a shortcut to post_traverse(mfa_that_fails_with_error_reason).

José Valim www.plataformatec.com.br Skype: jv.ptec Founder and Director of R&D

josevalim avatar Jan 21 '19 23:01 josevalim

Isn't that the fail combinator that already exists? I don't think what you describe can abort parsing early

tmbb avatar Jan 21 '19 23:01 tmbb

I think we don’t have a fail combinator?

But I see your point, returning error will just cause the whole choice to fail.

Maybe you can implement it like this:

choice([lookahead(combinator)]) |> label(...)

This what if the lookahead fails, the whole choice will fail with the given label.

José Valimwww.plataformatec.com.br http://www.plataformatec.com.br/Founder and Director of R&D

josevalim avatar Jan 21 '19 23:01 josevalim

Closing this for now. Please send a PR if it is still desired!

josevalim avatar May 18 '19 12:05 josevalim

I'd like to re-open this issue (although I still don't have time to fully understand the nimble_parsec internals and write it myself):

I have written a sigil to handle units of length. It should parse (among other things) sums of numbers follow by valid units. For example, it should parse 6in + 4pt. It should return an error if the unit is not among the valid units. For example (error message from nimble_parsec at the bottom of the error message; the message is generated from a label):

iex(35)> ~L[8pxt]      
** (Playfair.Length.ParserError) iex:35

    Error while trying to parse "8pxt"

        8pxt
         ───

      ** expected unit of measurement (%, mm, cm, pt, in, em, fr) while processing term

    (playfair 0.1.0) expanding macro: Playfair.Length.sigil_L/2
    iex:35: (file)

The above is great! The user gets proper feedback on which units to use thanks to the label combinator. However, if the expression is more complicated:

iex(35)> ~L[9in + 8pxt]
** (Playfair.Length.ParserError) iex:35

    Error while trying to parse "9in + 8pxt"

        9in + 8pxt
            ──────

      ** expected end of string

    (playfair 0.1.0) expanding macro: Playfair.Length.sigil_L/2
    iex:35: (file)

The above error message is not good. Nimble parsec tries a number of combinators in turn, and the last thing it attempts to match is a number followed by a valid unit followed by the end of the string. But what I really wanted was for the error to be at after the 8 and before the invalid unit (pxt). This feature would allow me to "bubble up" the error message into the user and gave the parser fail at the right spot.

tmbb avatar Jul 04 '23 09:07 tmbb