tree-sitter-c
Cast using non-standard type leads to call_expression
The following three cases create cast_expression nodes:
enum {
a = ((char)(2))
};
enum {
a = ((char)2)
};
enum {
a = ((abc)2)
};
However, if the cast uses a custom type and the operand is wrapped in (), we end up with a call_expression, which is wrong.
enum {
a = ((abc)(2))
};
# (enum_specifier 42 1 25
#   (enumerator_list 42 6 20
#     (enumerator 43 3 14
#       (identifier 43 3 1 "a")
#       (parenthesized_expression 43 7 10 "((abc)(2))"
#         (call_expression 43 8 8 "(abc)(2)"
#           (parenthesized_expression 43 8 5 "(abc)"
#             (identifier 43 9 3 "abc")
#           )
#           (argument_list 43 13 3 "(2)"
#             (number_literal 43 14 1 "2")
#           )
#         )
#       )
#     )
#   )
# )
There are other similar situations where the parser assumes that abc is an identifier and not a type:
a = (abc)*x;
a = (abc)&x;
a = (abc)-x;
The problem is that the C grammar is ambiguous here: it conflates syntax and semantics. For example, the first one can be either a cast applied to a dereference or a multiplication, depending purely on whether abc is a type or a variable. The parser would need to know whether abc has previously been typedef'd in order to choose the correct interpretation, but apparently it doesn't.
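To make the ambiguity concrete, here is a small sketch in plain C (the function names are made up for illustration, they are not part of tree-sitter-c). Each pair of functions contains the exact same expression tokens; only the declaration of abc differs, and with it the meaning of the expression:

```c
/* abc names a type here, so "(abc)-x" is a cast applied to unary minus. */
int minus_as_cast(int x) {
    typedef int abc;
    return (abc)-x;
}

/* abc names a variable here, so the same tokens are a binary subtraction. */
int minus_as_subtraction(int x) {
    int abc = 3;
    return (abc)-x;
}

/* abc names a type here, so "(abc)*x" casts the dereferenced pointer. */
long star_as_cast(int *x) {
    typedef long abc;
    return (abc)*x;
}

/* abc names a variable here, so the same tokens are a multiplication. */
long star_as_multiplication(int x) {
    long abc = 3;
    return (abc)*x;
}
```

With x = 5, minus_as_cast returns -5 while minus_as_subtraction returns -2. A parser that does not track typedef declarations sees identical input in both functions and has to pick one reading.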
I'm sorry, but yeah, C is too ambiguous here. One solution is to loosely follow the convention that types are often PascalCase identifiers while variables are snake_case or camelCase; this can easily be done with queries.
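As a rough illustration of the naming-convention heuristic, written in C rather than as a tree-sitter query: looks_like_type_name below is a hypothetical helper, not a tree-sitter API, and its exact rule (leading uppercase letter, at least one lowercase letter, no underscores) is an assumption about what "PascalCase" should mean here.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: guess, by naming convention only, whether an
 * identifier is a type name. Returns true for PascalCase names. */
bool looks_like_type_name(const char *name) {
    /* Must start with an uppercase letter (also rejects NULL and ""). */
    if (name == NULL || !isupper((unsigned char)name[0])) {
        return false;
    }
    bool has_lower = false;
    for (const char *p = name + 1; *p != '\0'; ++p) {
        if (*p == '_') {
            return false;          /* snake_case or MACRO_STYLE */
        }
        if (islower((unsigned char)*p)) {
            has_lower = true;      /* rules out ALLCAPS constants */
        }
    }
    return has_lower;
}
```

Note that this is only a guess based on style: a lowercase type name such as abc from the report above would still be treated as a variable.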