PPI icon indicating copy to clipboard operation
PPI copied to clipboard

PPI::Statement::Variable too greedy

Open moregan opened this issue 11 years ago • 1 comments

ppidump 'open( my $fh, ">", $filename );'
                    PPI::Document
                      PPI::Statement
[    1,   1,   1 ]     PPI::Token::Word         'open'
                        PPI::Structure::List    ( ... )
                          PPI::Statement::Variable
[    1,   7,   7 ]         PPI::Token::Word     'my'
[    1,  10,  10 ]         PPI::Token::Symbol   '$fh'
[    1,  13,  13 ]         PPI::Token::Operator         ','
[    1,  15,  15 ]         PPI::Token::Quote::Double    '">"'
[    1,  18,  18 ]         PPI::Token::Operator         ','
[    1,  20,  20 ]         PPI::Token::Symbol   '$filename'
[    1,  31,  31 ]     PPI::Token::Structure    ';'

variables() on the PPI::Statement::Variable returns just '$fh', which seems right to me, but it doesn't seem right that anything following '$fh' is part of the statement.

It also doesn't seem right that initializers for declared variables become part of the PPI::Statement::Variable:

ppidump 'my $x = 1;';
                    PPI::Document
                      PPI::Statement::Variable
[    1,   1,   1 ]     PPI::Token::Word         'my'
[    1,   4,   4 ]     PPI::Token::Symbol       '$x'
[    1,   7,   7 ]     PPI::Token::Operator     '='
[    1,   9,   9 ]     PPI::Token::Number       '1'
[    1,  10,  10 ]     PPI::Token::Structure    ';'

The absence of a facility like initializers() in PPI::Statement::Variables implies (to me) that including the initializer is not by design.

In playing around with Lexer.pm I found it pretty easy to have a variable declaration without parens stop after it sees the variable:

+               if ( $Statement->isa('PPI::Statement::Variable') ) {
+                       my @schildren = $Statement->schildren();
+                       if ( @schildren > 1 and !$schildren[1]->isa('PPI::Structure::List') ) {
+                               return $self->_rollback( $Token );
+                       }
+               }

but from the results it looks like that change is too naive:

ppidump 'open( my $fh, ">", $filename );'
                    PPI::Document
                      PPI::Statement
[    1,   1,   1 ]     PPI::Token::Word         'open'
                        PPI::Structure::List    ( ... )
                          PPI::Statement::Variable
[    1,   7,   7 ]         PPI::Token::Word     'my'
[    1,  10,  10 ]         PPI::Token::Symbol   '$fh'
                          PPI::Statement::Expression
[    1,  13,  13 ]         PPI::Token::Operator         ','
[    1,  15,  15 ]         PPI::Token::Quote::Double    '">"'
[    1,  18,  18 ]         PPI::Token::Operator         ','
[    1,  20,  20 ]         PPI::Token::Symbol   '$filename'
[    1,  31,  31 ]     PPI::Token::Structure    ';'

I don't know enough about the lexing/parsing to know whether it's just a case of needing a little more logic at statement end/statement begin, whether it's a fundamental problem of a variable declaration being an expression, or what.

moregan avatar Jan 20 '14 07:01 moregan

The issue of what to do after a statement-that-is-really-an-expression ends with more to parse seem similar to https://rt.cpan.org/Ticket/Display.html?id=36555

moregan avatar Jan 30 '14 03:01 moregan