
Incorrect Error Reporting

Open ducharmemp opened this issue 9 years ago • 0 comments

I'm attempting to use parsimonious to parse X12, the old ANSI file format. I have an example where the position of an error is reported incorrectly. (Keep in mind that my grammar was actually wrong, but I spent a long time looking in the wrong place.)

The grammar:

    l_isa_loop_isa = ("ISA") elem_sep ("00" / "03") elem_sep (any) elem_sep ("00" / "01") elem_sep (any) elem_sep ("01" / "14" / "20" / "27" / "28" / "29" / "30" / "33" / "ZZ") elem_sep (any) elem_sep ("00501") elem_sep (any) elem_sep ("P" / "T") elem_sep (any) segment_sep
    any = ~"[^~]"is
    elem_sep = "*"
    segment_sep = "~"

Applied to the string

"ISA00 00 ZZ123456789012345ZZ1234567890123460610151705*>00501000010216*0_T_:~"

the parser reports that it could not match "ZZ" in the alternation ("01" / "14" / "20" / "27" / "28" / "29" / "30" / "33" / "ZZ"). However, after stepping through parsimonious with the debugger, I noticed that when the "01" fails to match, the section

    if node is None and pos >= error.pos and (self.name or getattr(error.expr, 'name', None) is None):
        error.expr = self
        error.pos = pos

sets the error, but the error is never updated afterwards to the actual failure (the literal "00501" not matching). I'm currently using Python 3.5. The fix could be as simple as this, but I'm not entirely sure whether there are edge cases where it would fail.

```python
class OneOf(Compound):
    """A series of expressions, one of which must match
    Expressions are tested in order from first to last. The first to succeed
    wins.
    """
    def _uncached_match(self, text, pos, cache, error):
        cached_pos = error.pos
        for m in self.members:
            node = m.match_core(text, pos, cache, error)
            if node is not None:
                # Wrap the succeeding child in a node representing the OneOf:
                error.expr = None
                error.pos = cached_pos
                return Node(self.name, text, pos, node.end, children=[node])
```
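
To make the behaviour easier to discuss, here is a minimal, self-contained sketch of the bookkeeping described above. It is a toy model, not parsimonious itself: `ErrorState`, `Expr`, `Literal`, `Sequence`, the `reset_on_success` flag, the rule name `code_01`, and the input `"ZZ*00401"` are all made up for illustration, and caching and node construction are left out. It only mirrors the quoted recording condition and the proposed reset, assuming a failed alternative was recorded against a named expression.

```python
class ErrorState:
    """Hypothetical stand-in for the mutable error object passed down the parse."""
    def __init__(self):
        self.pos = -1
        self.expr = None


class Expr:
    def __init__(self, name=None):
        self.name = name

    def match_core(self, text, pos, error):
        end = self._match(text, pos, error)
        # Same shape as the quoted condition: record a failure only if it is at
        # least as far right AND (this expression is named OR the currently
        # recorded expression is unnamed).
        if end is None and pos >= error.pos and (
                self.name or getattr(error.expr, 'name', None) is None):
            error.expr = self
            error.pos = pos
        return end


class Literal(Expr):
    def __init__(self, literal, name=None):
        super().__init__(name)
        self.literal = literal

    def _match(self, text, pos, error):
        if text.startswith(self.literal, pos):
            return pos + len(self.literal)
        return None


class Sequence(Expr):
    def __init__(self, *members, name=None):
        super().__init__(name)
        self.members = members

    def _match(self, text, pos, error):
        for m in self.members:
            pos = m.match_core(text, pos, error)
            if pos is None:
                return None
        return pos


class OneOf(Expr):
    def __init__(self, *members, name=None, reset_on_success=False):
        super().__init__(name)
        self.members = members
        self.reset_on_success = reset_on_success

    def _match(self, text, pos, error):
        cached_pos = error.pos
        for m in self.members:
            end = m.match_core(text, pos, error)
            if end is not None:
                if self.reset_on_success:
                    # Models the proposed fix: drop failures recorded by
                    # alternatives tried before the one that succeeded.
                    error.expr = None
                    error.pos = cached_pos
                return end
        return None


def failure_report(reset_on_success):
    """Return (error position, literal blamed) for a deliberately bad input."""
    grammar = Sequence(
        OneOf(Literal("01", name="code_01"), Literal("ZZ"),
              reset_on_success=reset_on_success),
        Literal("*"),
        Literal("00501"),
    )
    error = ErrorState()
    grammar.match_core("ZZ*00401", 0, error)  # the real mismatch is at index 3
    return error.pos, getattr(error.expr, "literal", None)


print(failure_report(reset_on_success=False))
print(failure_report(reset_on_success=True))
```

In this toy, the stale failure of the named `"01"` alternative blocks the later, unnamed `"00501"` failure from being recorded unless the alternation resets the error when it succeeds. It says nothing about the edge cases mentioned above, for example failures recorded inside a successful alternative that the reset would also throw away.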

ducharmemp, Mar 15 '16 19:03