Lost comment on the last item of an array

Open j-d opened this issue 8 years ago • 2 comments

Hello,

<?php

$a = [
    1, // Comment 1
    2, // Comment 2
];

When parsing the above example, Comment 2 is lost. I think it is because it would be stored against a potential third item in the array. It would be great if there was a way to recover it.

I know it is lost because I dumped the serialized parsed statements, and the comment is not there.

This is the JSON dump, in case it helps:

[
  {
    "nodeType": "Expr_Assign",
    "var": {
      "nodeType": "Expr_Variable",
      "name": "a",
      "attributes": {
        "startLine": 3,
        "endLine": 3
      }
    },
    "expr": {
      "nodeType": "Expr_Array",
      "items": [
        {
          "nodeType": "Expr_ArrayItem",
          "key": null,
          "value": {
            "nodeType": "Scalar_LNumber",
            "value": 1,
            "attributes": {
              "startLine": 4,
              "endLine": 4,
              "kind": 10
            }
          },
          "byRef": false,
          "attributes": {
            "startLine": 4,
            "endLine": 4
          }
        },
        {
          "nodeType": "Expr_ArrayItem",
          "key": null,
          "value": {
            "nodeType": "Scalar_LNumber",
            "value": 2,
            "attributes": {
              "startLine": 5,
              "comments": [
                {
                  "nodeType": "Comment",
                  "text": "\/\/ Comment 1\n",
                  "line": 4,
                  "filePos": 21
                }
              ],
              "endLine": 5,
              "kind": 10
            }
          },
          "byRef": false,
          "attributes": {
            "startLine": 5,
            "comments": [
              {
                "nodeType": "Comment",
                "text": "\/\/ Comment 1\n",
                "line": 4,
                "filePos": 21
              }
            ],
            "endLine": 5
          }
        }
      ],
      "attributes": {
        "startLine": 3,
        "endLine": 6,
        "kind": 2
      }
    },
    "attributes": {
      "startLine": 3,
      "endLine": 6
    }
  }
]

Thanks for your great work.

j-d avatar May 02 '17 13:05 j-d

When parsing the above example, Comment 2 is lost. I think it is because it would be stored against a potential third item in the array. It would be great if there was a way to recover it.

This is indeed the case. PHP-Parser always associates comments with the following node -- if there is no following node, they are lost.

However, there is a way to manually retrieve such comments with a bit of extra work. You need to enable token positions as described in the lexer docs and obtain the tokens with a $lexer->getTokens() call.
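As a rough sketch of that setup, assuming a Composer install of nikic/php-parser 3.x or 4.x (the versions current around the time of this issue; later major versions changed this API), the lexer can be asked to record token positions through its usedAttributes option. The autoload path and the $code sample are assumptions for illustration.

```php
<?php
// Assumption: nikic/php-parser 3.x or 4.x installed via Composer.
require __DIR__ . '/vendor/autoload.php';

use PhpParser\Lexer;
use PhpParser\ParserFactory;

$code = <<<'CODE'
<?php

$a = [
    1, // Comment 1
    2, // Comment 2
];
CODE;

// Ask the lexer to record token positions in addition to the default attributes.
$lexer = new Lexer([
    'usedAttributes' => ['comments', 'startLine', 'endLine', 'startTokenPos', 'endTokenPos'],
]);
$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7, $lexer);

$stmts = $parser->parse($code);  // nodes now carry startTokenPos / endTokenPos
$tokens = $lexer->getTokens();   // raw token array for the code that was just parsed
```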

Then you should be able to retrieve trailing comments like these using something like the following code:

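use PhpParser\Node;

// Returns the first comment that follows $node on the node's last line, or null.
// $tokens is the array from $lexer->getTokens(); $node needs the endTokenPos attribute.
function getTrailingComment(array $tokens, Node $node) {
    assert($node->hasAttribute('endTokenPos'));

    $pos = $node->getAttribute('endTokenPos');
    $endLine = $node->getAttribute('endLine');

    for (; $pos < count($tokens); ++$pos) {
        // Single-character tokens such as "," are plain strings without line information.
        if (!is_array($tokens[$pos])) continue;
        list($type, $content, $line) = $tokens[$pos];
        // Stop once the scan moves past the line on which the node ends.
        if ($line > $endLine) break;
        if ($type === T_COMMENT || $type === T_DOC_COMMENT) {
            return $content;
        }
    }

    return null;
}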

This code will return the first comment after the node that is still on the same line.
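To try the same scan without PHP-Parser installed, here is a minimal sketch that uses PHP's built-in token_get_all() instead of the lexer. The helper name trailingCommentAfter is hypothetical, and the token position and line of each integer literal are found by hand as a stand-in for the endTokenPos and endLine attributes.

```php
<?php
// Hypothetical helper: scan forward from a known token position for a
// comment that starts on the given line.
function trailingCommentAfter(array $tokens, int $pos, int $endLine): ?string {
    for ($i = $pos + 1, $n = count($tokens); $i < $n; ++$i) {
        // Single-character tokens such as "," are plain strings without a line.
        if (!is_array($tokens[$i])) continue;
        [$type, $content, $line] = $tokens[$i];
        if ($line > $endLine) break;
        if ($type === T_COMMENT || $type === T_DOC_COMMENT) {
            // PHP 7 keeps the trailing newline on "//" comments; strip it.
            return rtrim($content);
        }
    }
    return null;
}

$code = "<?php\n\n\$a = [\n    1, // Comment 1\n    2, // Comment 2\n];\n";
$tokens = token_get_all($code);

// Stand-in for endTokenPos/endLine: locate each integer literal by hand.
$found = [];
foreach ($tokens as $pos => $token) {
    if (is_array($token) && $token[0] === T_LNUMBER) {
        $found[$token[1]] = trailingCommentAfter($tokens, $pos, $token[2]);
    }
}

foreach ($found as $value => $comment) {
    echo $value, ' => ', $comment, "\n";
}
```

Both comments are reachable this way because the scan works on the token stream and does not depend on a following node existing.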

nikic avatar May 06 '17 17:05 nikic

When I was processing line comments, I found that assigning them to the next statement was not the structure I expected. Wouldn't it be better if they were statements/expressions in their own right?

stanvass avatar Jul 04 '17 01:07 stanvass