tree-sitter-c
tree-sitter-c copied to clipboard
Wrong precedences for sizeof and conditional operator
The parse trees for certain expressions involving sizeof <expr>
and the ternary conditional operator are wrong. Consider the following examples:
For sizeof
, this...
void f(void) {
int a;
a = sizeof a + 1;
}
...is parsed as...
[...]
(assignment_expression [2, 4] - [2, 20]
left: (identifier [2, 4] - [2, 5])
right: (sizeof_expression [2, 8] - [2, 20]
value: (binary_expression [2, 15] - [2, 20]
left: (identifier [2, 15] - [2, 16])
right: (number_literal [2, 19] - [2, 20]))))))))
...whereas this would be expected:
[...]
(assignment_expression [2, 4] - [2, 20]
left: (identifier [2, 4] - [2, 5])
right: (binary_expression [2, 8] - [2, 20]
left: (sizeof_expression [2, 8] - [2, 16]
value: (identifier [2, 15] - [2, 16]))
right: (number_literal [2, 19] - [2, 20])))))))
And for the ternary operator, this...
void f(void) {
int a;
a = 0 ? 1 : 2;
}
...is parsed as...
[...]
(conditional_expression [2, 4] - [2, 17]
condition: (assignment_expression [2, 4] - [2, 9]
left: (identifier [2, 4] - [2, 5])
right: (number_literal [2, 8] - [2, 9]))
consequence: (number_literal [2, 12] - [2, 13])
alternative: (number_literal [2, 16] - [2, 17]))))))
...whereas this would be expected:
[...]
(assignment_expression [2, 4] - [2, 17]
left: (identifier [2, 4] - [2, 5])
right: (conditional_expression [2, 8] - [2, 17]
condition: (number_literal [2, 8] - [2, 9])
consequence: (number_literal [2, 12] - [2, 13])
alternative: (number_literal [2, 16] - [2, 17])))))))
I would suggest changing the precedences in grammar.js
to the following:
const PREC = {
PAREN_DECLARATOR: -10,
ASSIGNMENT: -2, // from -1
CONDITIONAL: -1, // from -2
DEFAULT: 0,
LOGICAL_OR: 1,
LOGICAL_AND: 2,
INCLUSIVE_OR: 3,
EXCLUSIVE_OR: 4,
BITWISE_AND: 5,
EQUAL: 6,
RELATIONAL: 7,
SHIFT: 8, // from 9
ADD: 9, // from 10
MULTIPLY: 10, // from 11
CAST: 11, // from 12
SIZEOF: 12, // from 8
UNARY: 13,
CALL: 14,
FIELD: 15,
SUBSCRIPT: 16
};
Edit: Apparently my suggested fix breaks other examples, e.g.
char f(int);
int main(void) {
return sizeof(f)(42);
}
(which seems odd to me since the relation between the precedence of SIZEOF
and CALL
is not changed by the fix.)
Well, according to specification sizeof must only accept unary expr and type names.
While in grammar sizeof uses general $._expression
, but only subset of it is needed:
https://github.com/tree-sitter/tree-sitter-c/blob/5aa0bbbfc41868a3727b7a89a90e9f52e0964b2b/grammar.js#L808-L814
Should grammar contain $._unary_expression = choice(...)
that will contain highlighted expressions?
Or is it intentionally accepts all expressions, so parser can recover from error?
thx @philipturnbull for the fix!