PHP_CodeSniffer

Tokenizer doesn't include new line chars in "length"

Open jrfnl opened this issue 3 years ago • 0 comments

The following code sample:

<?php

    // comment.
    function foo() {}

... will tokenize as follows:

Ptr | Ln | Col  | Cond | ( #) | Token Type                 | [len]: Content
-------------------------------------------------------------------------
  0 | L1 | C  1 | CC 0 | ( 0) | T_OPEN_TAG                 | [  5]: <?php

  1 | L2 | C  1 | CC 0 | ( 0) | T_WHITESPACE               | [  0]:

  2 | L3 | C  1 | CC 0 | ( 0) | T_WHITESPACE               | [  4]: ⸱⸱⸱⸱
  3 | L3 | C  5 | CC 0 | ( 0) | T_COMMENT                  | [ 11]: // comment.

  4 | L4 | C  1 | CC 0 | ( 0) | T_WHITESPACE               | [  4]: ⸱⸱⸱⸱
  5 | L4 | C  5 | CC 0 | ( 0) | T_FUNCTION                 | [  8]: function
  6 | L4 | C 13 | CC 0 | ( 0) | T_WHITESPACE               | [  1]: ⸱
  7 | L4 | C 14 | CC 0 | ( 0) | T_STRING                   | [  3]: foo
  8 | L4 | C 17 | CC 0 | ( 0) | T_OPEN_PARENTHESIS         | [  1]: (
  9 | L4 | C 18 | CC 0 | ( 0) | T_CLOSE_PARENTHESIS        | [  1]: )
 10 | L4 | C 19 | CC 0 | ( 0) | T_WHITESPACE               | [  1]: ⸱
 11 | L4 | C 20 | CC 0 | ( 0) | T_OPEN_CURLY_BRACKET       | [  1]: {
 12 | L4 | C 21 | CC 0 | ( 0) | T_CLOSE_CURLY_BRACKET      | [  1]: }
 13 | L4 | C 22 | CC 0 | ( 0) | T_WHITESPACE               | [  0]:

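The above raised a question for me about the length recorded in the token array: it does not seem to include new line characters. For example, T_OPEN_TAG is reported with a length of 5, and the T_WHITESPACE tokens that contain only a new line are reported with a length of 0. Is this intentional?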
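To make the discrepancy concrete, here is a minimal sketch that compares the raw byte count of a token's content with a count that ignores trailing new line characters. The helper `lengthWithoutEol()` is hypothetical and is not part of PHP_CodeSniffer. The token contents are assumptions reconstructed from the table above, with `"\n"` assumed as the end-of-line character. It only illustrates the arithmetic; it does not claim to show how the tokenizer actually computes the length.

```php
<?php

/**
 * Hypothetical helper: length of the content once trailing CR/LF
 * characters are stripped. Other trailing whitespace is kept.
 */
function lengthWithoutEol(string $content): int
{
    return strlen(rtrim($content, "\r\n"));
}

// Assumed contents for the first few tokens in the table above.
$tokens = [
    ['type' => 'T_OPEN_TAG',   'content' => "<?php\n"],
    ['type' => 'T_WHITESPACE', 'content' => "\n"],
    ['type' => 'T_WHITESPACE', 'content' => '    '],
    ['type' => 'T_COMMENT',    'content' => "// comment.\n"],
];

foreach ($tokens as $token) {
    printf(
        "%-13s strlen: %2d, without new line: %2d\n",
        $token['type'],
        strlen($token['content']),
        lengthWithoutEol($token['content'])
    );
}
```

Under these assumptions, the second figure on each line matches the `[len]` column in the table, while plain `strlen()` is one higher for every token that ends in a new line.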

jrfnl, May 30 '22 09:05