tree-sitter-c
Function-like macros containing code do not parse
The following code neither parses correctly nor recovers gracefully:
void haunted() {
     {
          DEBUG_CODE(
              if (0) {
                  a(1);
              }
          );
      }
}
On the playground, this produces the node tree:
translation_unit [0, 0] - [9, 0]
  function_definition [0, 0] - [7, 7]
    type: primitive_type [0, 0] - [0, 4]
    declarator: function_declarator [0, 5] - [0, 14]
      declarator: identifier [0, 5] - [0, 12]
      parameters: parameter_list [0, 12] - [0, 14]
    body: compound_statement [0, 15] - [7, 7]
      compound_statement [1, 5] - [5, 15]
        expression_statement [2, 10] - [4, 23]
          call_expression [2, 10] - [4, 22]
            function: identifier [2, 10] - [2, 20]
            arguments: argument_list [2, 20] - [4, 22]
              ERROR [3, 14] - [3, 22]
                call_expression [3, 14] - [3, 20]
                  function: identifier [3, 14] - [3, 16]
                  arguments: argument_list [3, 17] - [3, 20]
                    number_literal [3, 18] - [3, 19]
              call_expression [4, 18] - [4, 22]
                function: identifier [4, 18] - [4, 19]
                arguments: argument_list [4, 19] - [4, 22]
                  number_literal [4, 20] - [4, 21]
              MISSING ) [4, 22] - [4, 22]
      ERROR [6, 10] - [6, 11]
      expression_statement [6, 11] - [6, 12]
  ERROR [8, 0] - [8, 1]
The solution is probably to use `_top_level_item` instead of `_expression` inside `argument_list`, since tree-sitter has no way to know that a particular call-like thing is a macro expansion.
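To illustrate why restricting arguments to expressions is too narrow, here is a small sketch of call-like macro invocations whose arguments are not expressions. `offsetof` is the standard macro from `<stddef.h>`; `RUN_TWICE`, `struct point`, and the two functions are hypothetical names made up for this example and do not come from the issue.

```c
#include <stddef.h>

struct point { int x; int y; };

/* Hypothetical helper macro: its argument is a statement, not an expression. */
#define RUN_TWICE(stmt) do { stmt; stmt; } while (0)

size_t offset_of_y(void) {
    /* Looks like a call, but the first argument is a type name. */
    return offsetof(struct point, y);
}

int bump_twice(int n) {
    /* Looks like a call, but the argument is an if statement. */
    RUN_TWICE(if (n < 10) n += 5);
    return n;
}
```

Both invocations are syntactically indistinguishable from function calls until the preprocessor has run, which is exactly the information a parser working on unpreprocessed source does not have.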
This particular example seems to be extra haunted because tree-sitter makes different choices about where to put the ERROR fill tokens depending on the length of the `DEBUG_CODE` macro name. If the last `E` is removed, there are still errors, but tree-sitter recovers "better" and matches the curly braces correctly. Removing the `1` in the argument to `a` also fixes the problem.
AFAIK that's invalid C. A macro invocation has to begin and end on the same line, barring `\` + `\n`, as that gets replaced even before evaluation.
The definition has to be on one line, but 6.10.3 paragraph 10 of N1256 (C99 working draft) states:
Within the sequence of preprocessing tokens making up an invocation of a function-like macro, new-line is considered a normal white-space character.
So within the invocation, `\n` counts as a normal white-space character, and the multi-line invocation above is valid.
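As a quick check of the quoted rule, the following sketch compiles with a conforming C compiler even though the invocation spans several lines. The issue never shows how `DEBUG_CODE` is defined, so the definition here is an assumption; the `enabled` parameter and the `calls` counter are likewise added only to make the behavior observable.

```c
/* Assumed definition: the issue does not show the real one.
 * This version simply pastes its argument into the surrounding code. */
#define DEBUG_CODE(code) code

static int calls = 0; /* running total of the values passed to a() */

static void a(int x) {
    calls += x;
}

/* Same shape as the snippet in the issue: the macro argument is a whole
 * statement, and the invocation is split across several lines. */
int haunted(int enabled) {
    {
        DEBUG_CODE(
            if (enabled) {
                a(1);
            }
        );
    }
    return calls;
}
```

Calling `haunted(0)` leaves the counter untouched, while `haunted(1)` runs the statement passed to the macro, so the newlines inside the invocation are treated as ordinary white space.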