PHP-Parser icon indicating copy to clipboard operation
PHP-Parser copied to clipboard

@file docblock is associated as a comment to the following PHP statement

Open joachim-n opened this issue 7 years ago • 4 comments

<?php\n
\n
/**\n
 * @file\n
 * TODO: Enter file description here.\n
 */\n
\n
/**\n
 * Implements hook_hook_info().\n
 */\n
function testmodule8b_hook_info() {\n

Gets me this:

array:1 [
  0 => PhpParser\Node\Stmt\Function_ {#219
    // SNIP
    #attributes: array:3 [
      "startLine" => 11
      "comments" => array:2 [
        0 => PhpParser\Comment\Doc {#212
          #text: """
            /**\n
             * @file\n
             * TODO: Enter file description here.\n
             */
            """
          #line: 3
          #filePos: 7
        }
        1 => PhpParser\Comment\Doc {#214
          #text: """
            /**\n
             * Implements hook_hook_info().\n
             */
            """
          #line: 8
          #filePos: 63
        }
      ]
      "endLine" => 19
    ]
  }
]

When the file has an import statement:

<?php\n
\n
/**\n
 * @file\n
 * TODO: Enter file description here.\n
 */\n
\n
use Drupal\Core\Routing\RouteMatchInterface;\n
\n
/**\n
 * Implements hook_help().\n
 */\n
function testmodule8b_help($route_name, RouteMatchInterface $route_match) {\n

the AST is:

array:2 [
  0 => PhpParser\Node\Stmt\Use_ {#170
    +type: 1
    +uses: array:1 [
      0 => PhpParser\Node\Stmt\UseUse {#228
        +type: 0
        +name: PhpParser\Node\Name {#214
          +parts: array:4 [
            0 => "Drupal"
            1 => "Core"
            2 => "Routing"
            3 => "RouteMatchInterface"
          ]
          #attributes: array:2 [
            "startLine" => 8
            "endLine" => 8
          ]
        }
        +alias: "RouteMatchInterface"
        #attributes: array:2 [
          "startLine" => 8
          "endLine" => 8
        ]
      }
    ]
    #attributes: array:3 [
      "startLine" => 8
      "comments" => array:1 [
        0 => PhpParser\Comment\Doc {#212
          #text: """
            /**\n
             * @file\n
             * TODO: Enter file description here.\n
             */
            """
          #line: 3
          #filePos: 7
        }
      ]
      "endLine" => 8
    ]
  }
  // ... etc

In both cases, the @file docblock should be treated the same in the AST.

joachim-n avatar Nov 16 '17 22:11 joachim-n

The file doc block will always be on the first AST node in the file. Your two examples differ by what kind of node that is. In either case, you should be able to retrieve it using $stmts[0]->getAttribute('comments')[0] (ignoring checks for all of those existing). (Or, to handle additional comments before it, by looping over $stmts[0]->getAttribute('comments') and finding the doc comment with @file in it.)

I'm not sure how else this can be handled (at least while staying within the current model where a comment is always attached to a node).

nikic avatar Nov 16 '17 22:11 nikic

Indeed, I see the problem.

To me at least, conceptually, a docblock that has spacing either side of it isn't attached to anything.

@file docblocks are one such example, but they are used for other kinds of documentation too, eg in https://github.com/drupal/core/blob/8.5.x/core.api.php

joachim-n avatar Nov 27 '17 20:11 joachim-n

Wanted to chip in with a question: is there a possible technical limitation as the reason that docblocks and comments aren't actually separate nodes? I wanted to ask as 4.0.0 is now in the works and, if a BC break is to happen, this would be an opportune time.

In some scenario's attaching docblocks to the next node is intuitive, such as when a method docblock is present, it obviously belongs to the method below it. In other cases, it is rather awkward, such as in this case where the docblock has no bearing on the next elment at all or just contains general information.

Perhaps introducing a separate node for comments as well as docblocks would provide a solution? The next node could, if backwards compatibility is desired, still contain a reference to that node.

It would also make scenario's such as "find whatever is located at this offset in the file" more intuitive, which could then also include a docblock or comment without having to scan neighbouring nodes.

Gert-dev avatar Dec 03 '17 18:12 Gert-dev

@Gert-dev The main issue is that comments may occur in virtually any position.

/*a*/if/*b*/(/*c*/$bar/*d*/)/*e*/{
    /*f*/call/*g*/(/*h*/$bar/*i*/)/*j*/;
}

It's not really feasible to represent all of these comments as explicit AST nodes, at least not without making the node structure significantly more complicated.

Of course, the current approach of appending comments to the next node is also not able to preserve all the information, as only comments directly preceding a node are stored (a, c, f, h). Unfortunately this also includes the relatively common case of trailing comments with no following node (#384). Additionally, the current implementation has some technical issues, leading to comments being assigned to multiple nodes (#253).

Overall I'm not very happy with the current situation. The current implementation is based on the lexer attribute system, which essentially restricts us to assigning comments to the next node. I have been considering to drop this in favor of a separate visitor that will assign comments based on token offset information. This would provide more flexibility in the way comments are assigned, e.g. by distinguishing leading and trailing comments, and also allowing to preserve information about interior comments that are currently lost entirely. (A downside is that it would require a separate visitor run, which is probably quite a bit slower than assigning comments on the fly.)

Regarding the storage of comments as entirely separate nodes, at least in the cases where it is possible (e.g. statement lists), I agree that this would make more semantic sense in many cases (the direct association with nodes is only really meaningful for doc comments). However, it would also make some basic operations more complicated, e.g. adding a doc comment to a method would require obtaining a reference to the parent structure (e.g. class node), locating the method in the statement list and inserting a doc comment node before it.

nikic avatar Dec 03 '17 20:12 nikic