instaparse icon indicating copy to clipboard operation
instaparse copied to clipboard

Improve error message when CFG is malformed due to a missing closing quote or paren.

Open Engelberg opened this issue 11 years ago • 7 comments

Engelberg avatar Apr 10 '13 18:04 Engelberg

Let me first thank you for such a great thing you've published with Instaparse. I've been wanting to learn (E/A)BNF and language parsing for a while but I found it hard to get going with it: instaparse is a breath of fresh air - just awesome, thanks for making this possible for me (and others :)

Now on topic: Is it perhaps some case such as I've got illustrated in the screenshot? I've tried simplifying the expression under the assumption it was either with 'eps' or without, '|' | '¦' are referenced as the location but fairly obvious it is somewhere else:

image

Also it's become apparent that the line numbers reference the CFG string line numbers, and not those in the file, but only because I did check that and I'm working inline... should this be a file with a parser in it, it may not be so obvious - not that I suggest it should be changed but something worth considering and/or mentioning I guess.

So anyway, to show (as you'll know) that the referenced line/column position is not where the fault seems to be and that the eps is not truely required just conventional (the message says on multiple occasions that it expects eps-like symbols. Perhaps this could be clarified a bit as well?

image

Furthermore, but slightly off-topic, I don't really get why the equal = sign as a token needs to be escaped.. but ok.

edit: I now notice something off with the quotes on the working example a = "|" | "¦' notice the closing quote single and the rest double? Could this be something like a bug or is this expected?

clojens avatar Aug 14 '14 20:08 clojens

So the most common cause for this sort of error message (an error at the very end of the parser description) is that a quote is either added or missing somewhere along the line, so everything is off and it gets to the end and can't make sense out of it.

For example,

S = 'a'' | 'b'

or

S = 'a | 'b'

are problematic (or similarly with parens).

Ideally, I'd like to provide a better error message for these sorts of cases, because knowing that it failed at the very end doesn't really help all that much to track it down.

So double-check that all your quotes are well-matched. If you still have a problem, post here your full grammar. I see from the picture that you are using a lot of unusual punctuation marks, maybe one of them is throwing off the parser in a way that I haven't accounted for, so I can investigate that.

I'm not sure what you mean about the = sign needing to be escaped.

The way your vertical parser prints does look unexpected. If there is a bug going on there, I doubt it is a bug with the underlying way the parser works, simply with how it prints. I can look into that as well.

Engelberg avatar Aug 15 '14 00:08 Engelberg

I am unable to replicate the printing behavior you saw with your vertical parser printing as a = "|" | "¦' Perhaps this is a problem with LightTable?

Engelberg avatar Aug 19 '14 01:08 Engelberg

Were you able to spot any unmatched quotation marks in your parser definition which would explain the error you were getting?

Engelberg avatar Aug 19 '14 01:08 Engelberg

I got something similar with this:

(def parse-uri
  (insta/parser
   "url = [www] domain [path]
    www = 'www.'
    domain = #'[^.]+\\.[^.]+'
    path = backslash (token [backslash])*
    backslash = '\\'
    token = 'something'"))

and it seems to come from the inclusion of the backslash rule, because if I replace with with just regular letters, it compiles fine. Could it be getting confused and thinking it's an escaped quote?

neverfox avatar Jun 15 '15 06:06 neverfox

The general rule of thumb is that if you are expressing your grammar in a Clojure string (as opposed to storing it in a separate file) you have to double every backslash because Clojure's reader collapses ever pair of backslashes down to one.

In a regular Clojure string, a backslash would be "\\".

Therefore, in a parser specification, you need to double each of those backslashes, so your rule should be:

backslash = '\\\\'

'\\' would work if you were storing it in a file, and slurping it in.

Certainly, though, this is the kind of thing that would be helped by a better error message, so I hope to add that at some point.

Engelberg avatar Jun 15 '15 07:06 Engelberg

Ah, of course. Looking at the OP's code, it seems like that was his problem too.

neverfox avatar Jun 17 '15 00:06 neverfox