tree-sitter-c

Function-like macros containing code do not parse

Open sapphire-arches opened this issue 2 years ago • 2 comments

The following code does not parse correctly, or recover gracefully:

void haunted() {
     {
          DEBUG_CODE(
              if (0) {
                  a(1);
              }
          );
      }
}

On the playground, this produces the node tree:

translation_unit [0, 0] - [9, 0]
  function_definition [0, 0] - [7, 7]
    type: primitive_type [0, 0] - [0, 4]
    declarator: function_declarator [0, 5] - [0, 14]
      declarator: identifier [0, 5] - [0, 12]
      parameters: parameter_list [0, 12] - [0, 14]
    body: compound_statement [0, 15] - [7, 7]
      compound_statement [1, 5] - [5, 15]
        expression_statement [2, 10] - [4, 23]
          call_expression [2, 10] - [4, 22]
            function: identifier [2, 10] - [2, 20]
            arguments: argument_list [2, 20] - [4, 22]
              ERROR [3, 14] - [3, 22]
                call_expression [3, 14] - [3, 20]
                  function: identifier [3, 14] - [3, 16]
                  arguments: argument_list [3, 17] - [3, 20]
                    number_literal [3, 18] - [3, 19]
              call_expression [4, 18] - [4, 22]
                function: identifier [4, 18] - [4, 19]
                arguments: argument_list [4, 19] - [4, 22]
                  number_literal [4, 20] - [4, 21]
              MISSING ) [4, 22] - [4, 22]
      ERROR [6, 10] - [6, 11]
      expression_statement [6, 11] - [6, 12]
  ERROR [8, 0] - [8, 1]

The solution is probably to use _top_level_item instead of _expression inside argument_list, since tree-sitter has no way to know that a particular call-like thing is a macro expansion.
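
For context, here is a minimal sketch of why the snippet is valid C once preprocessed, even though its argument is not an expression. The issue never shows how DEBUG_CODE is defined, so the variadic definition below is an assumption, and a() is given a hypothetical body so the effect can be observed. The condition is changed from 0 to 1 for the same reason.

```c
/* Hypothetical definition: the issue does not show the real DEBUG_CODE.
   A variadic parameter lets the argument contain top-level commas. */
#define DEBUG_CODE(...) do { __VA_ARGS__ } while (0)

static int calls = 0;

/* Hypothetical stand-in for the a() called in the issue. */
static void a(int n) { calls += n; }

int haunted(void) {
    calls = 0;
    {
        /* The argument is a whole if-statement, not an expression.
           The preprocessor pastes it into the do/while body. */
        DEBUG_CODE(
            if (1) {
                a(1);
            }
        );
    }
    return calls;
}
```

The parser only sees the text before preprocessing, where DEBUG_CODE(...) is indistinguishable from an ordinary call whose argument list would have to hold expressions.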

This particular example seems to be extra haunted because tree-sitter makes different choices about where to put the ERROR fill tokens depending on the length of the DEBUG_CODE macro. If the last E is removed, there are still errors but tree-sitter recovers "better" and matches the curly braces correctly. Removing the 1 in the argument to a also fixes the problem.

sapphire-arches, Jun 17 '22 22:06

AFAIK that's invalid C. A macro invocation has to begin and end on the same line, barring \ + \n, as that gets replaced even before evaluation.

dan1338, Jul 03 '22 15:07

The definition has to be one line, but 6.10.3 paragraph 10 of n1256 (C99 working draft) states:

Within the sequence of preprocessing tokens making up an invocation of a function-like macro, new-line is considered a normal white-space character.

So within the invocation, \n counts as a normal whitespace character.

sapphire-arches, Jul 04 '22 07:07
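
The distinction drawn in the last comment can be sketched with a small example. ADD and sum_across_lines are hypothetical names chosen for illustration. The definition needs a backslash-newline to span two physical lines, while the invocation spans four lines with no continuation at all.

```c
/* The definition must be one logical line; the backslash-newline
   splices the two physical lines before the directive is processed. */
#define ADD(x, y) \
    ((x) + (y))

int sum_across_lines(void) {
    /* Inside the invocation, each new-line is ordinary whitespace,
       so no backslashes are needed here. */
    return ADD(
        1,
        2
    );
}
```

A conforming compiler should accept this as written, which is consistent with the quoted paragraph.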