packcc
How to do good syntax error handling?
Hi, I can use special rules to catch common errors and point out which row they occur on. I keep track of rows and store the count in auxil:
_ <- (WS / Comments)*
__ <- (WS / Comments)+
WS <- [ \t\r\n] {
if ($0[0] == '\n') {
auxil->row++;
}
}
Comments <- SingleLineComment / BlockComment
SingleLineComment <- "//" (!EOL .)* EOL?
EOL <- ("\r\n" / "\n" / "\r") { auxil->row++; }
BlockComment <- "/*" (BlockCommentContent / EOL)* "*/"
BlockCommentContent <- (!("*/" / EOL) .)
I can then use a special rule to catch a common error, e.g.
Block <- e:Expr { $$ = CN(BLOCK, 1, e); } ( _ CommaSeparator _ e:Expr { AC($$, e); })*
CommaSeparator <- ("," / ";") {
if (strcmp($0, ";") == 0) {
fprintf(stderr, "%d: Use ',' to separate expressions in blocks\n", auxil->row);
}
}
But with unexpected syntax errors everything breaks down and I cannot point out which row the error occurred on.
As a workaround I added the following:
static int ROW = 1;
static int satie_getchar(satie_auxil_t* _auxil) {
int c = getchar();
if (c == '\n') {
ROW++;
}
return c;
}
static void satie_error(satie_auxil_t* auxil) {
panic("Syntax error near line %d", ROW);
}
It works and I have re-invented awk-like error handling. :-) It's crude though.
Ideally I would like to point out syntax errors very precisely with both row and column info.
I haven't been able to figure out how to do that. Any hints?
Cheers /Joakim
The TinyC example might help with counting rows and columns precisely. It uses a customized PCC_GETCHAR() macro together with the text reader function system__read_source_file(). In this function, line break positions (in bytes) are recorded by calling append_line_head_() while bytes are fetched from the input text. The parsing positions in the input text can be obtained from the predefined variables $0s and $0e (see README.md). The row and column numbers are computed in the function compute_line_and_column_() from the line break positions and the parsing position. If you don't need to support multibyte characters, the code below
count_characters_(obj->source.text.p, obj->source.line.p[i - 1], pos) + 1
can be simplified to
pos - obj->source.line.p[i - 1] + 1
If you don't need to handle multibyte characters, the input text doesn't have to be kept in memory as the example does. As for error reporting, the example does it in system__handle_syntax_error().
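To make the idea concrete, here is a minimal sketch of the byte-oriented variant: record the offset at which each line starts while reading, then turn a byte position into a row and column with a binary search. This is not the TinyC code itself. The names line_table_t, source_t, source_getchar() and source_locate() are hypothetical, columns are counted in bytes (no multibyte support), and only '\n' is treated as a line break.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: byte offsets at which each line starts.
 * heads[i] is the offset of the first byte of line i + 1. */
typedef struct {
    size_t *heads;
    size_t  len;
    size_t  cap;
} line_table_t;

static void line_table_init(line_table_t *t) {
    t->heads = NULL;
    t->len = 0;
    t->cap = 0;
}

static void line_table_free(line_table_t *t) {
    free(t->heads);
    line_table_init(t);
}

/* Appends one line-start offset; returns 0 on success, -1 if out of memory. */
static int line_table_append(line_table_t *t, size_t offset) {
    if (t->len == t->cap) {
        size_t cap = t->cap ? t->cap * 2 : 16;
        size_t *p = realloc(t->heads, cap * sizeof *p);
        if (p == NULL) return -1;
        t->heads = p;
        t->cap = cap;
    }
    t->heads[t->len++] = offset;
    return 0;
}

/* Hypothetical reader state; in a real parser this would live in auxil. */
typedef struct {
    FILE        *in;
    size_t       offset;  /* number of bytes consumed so far */
    line_table_t lines;
} source_t;

static int source_init(source_t *s, FILE *in) {
    s->in = in;
    s->offset = 0;
    line_table_init(&s->lines);
    return line_table_append(&s->lines, 0);  /* line 1 starts at offset 0 */
}

/* Intended to be called from a custom PCC_GETCHAR(auxil) definition.
 * Every '\n' marks the start of a new line at the following byte. */
static int source_getchar(source_t *s) {
    int c = fgetc(s->in);
    if (c == EOF) return EOF;
    s->offset++;
    if (c == '\n') line_table_append(&s->lines, s->offset);
    return c;
}

/* Converts a byte position into 1-based line and column numbers.
 * Assumes source_init() succeeded, so the table has at least one entry.
 * Binary search for the last line head that is <= pos. */
static void source_locate(const source_t *s, size_t pos,
                          size_t *line, size_t *col) {
    size_t lo = 0, hi = s->lines.len;
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (s->lines.heads[mid] <= pos) lo = mid; else hi = mid;
    }
    *line = lo + 1;
    *col = pos - s->lines.heads[lo] + 1;
}
```

In an action, the position passed to source_locate() would be $0s or $0e. For an unexpected syntax error, one option (an assumption on my part, not something the example prescribes) is to pass the number of bytes read so far, which gives the approximate point where the parser gave up.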
@joagre, I'm wondering whether my answer was what you wanted. If not, let me know. I'll close this issue in a week if there is no reply. Feel free to reopen it if you need to.