Raise an exception from the grammar?

Open rowlesmr opened this issue 3 years ago • 3 comments

Hi all

Is there a way to raise an exception directly from the grammar?

grammar = """
datablockheading  = DATA  blockframecode
DATA = "data_"
blockframecode = nonblankchar+ / RAISE_ERROR
nonblankchar = ~"[A-Za-z0-9]"
"""

If not, what is the best way to achieve the same behaviour? I'm coming from PEGTL.

rowlesmr avatar Sep 19 '22 02:09 rowlesmr

I never got around to adding fine-grained error reporting to Parsimonious, but the design in my head involved annotated PEG-style cuts. Semantically, they might have been similar to what you suggest here, if I guess the behavior right.

In the meantime, you could define a visitor method called visit_RAISE_ERROR (in this case) and raise an exception from there.

erikrose avatar Sep 19 '22 12:09 erikrose

> In the meantime, you could define a visitor method called visit_RAISE_ERROR (in this case) and raise an exception from there.

I don't think that would work, since the parse itself would fail (without consuming all the input) before any visitor runs. For example, you could define RAISE_ERROR to consume either zero characters or the rest of the string, but neither works for some grammars:

from parsimonious import Grammar
g = Grammar(r"""
    parenthesized = "(" addition_expr ")"
    addition_expr = (number "+" number) / RAISE_ERROR
    number = ~"\d+"
    RAISE_ERROR = ~".+"s
""")
g.parse("(...)")  # parsimonious.exceptions.ParseError: Rule 'number' didn't match at '...)' (line 1, column 2).

There, the parse fails because RAISE_ERROR greedily consumes the ) at the end as well, so the closing ")" literal has nothing left to match.

It's a bit clunky, but one option would be to define your own custom expression type that just raises an error:

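from parsimonious import Grammar
from parsimonious.expressions import Expression


class RAISE_ERROR(Expression):
    # Called whenever the parser tries this expression at a position
    # it has not cached yet; raising here aborts the whole parse.
    def _uncached_match(self, text, pos, cache, error):
        raise Exception(f"You messed up at {pos}, 🤦‍♂️.")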


g = Grammar(r"""
    parenthesized = "(" addition_expr ")"
    addition_expr = (number "+" number) / RAISE_ERROR
    number = ~"\d+"
""", RAISE_ERROR=RAISE_ERROR())
g.parse("(...)")  # Exception: You messed up at 1, 🤦‍♂️.

lucaswiman avatar Sep 19 '22 20:09 lucaswiman

Good point. So yes, I think your workaround is the best bet for the moment.

erikrose avatar Oct 08 '22 14:10 erikrose