PPI::Statement::Variable too greedy
ppidump 'open( my $fh, ">", $filename );'
PPI::Document
PPI::Statement
[ 1, 1, 1 ] PPI::Token::Word 'open'
PPI::Structure::List ( ... )
PPI::Statement::Variable
[ 1, 7, 7 ] PPI::Token::Word 'my'
[ 1, 10, 10 ] PPI::Token::Symbol '$fh'
[ 1, 13, 13 ] PPI::Token::Operator ','
[ 1, 15, 15 ] PPI::Token::Quote::Double '">"'
[ 1, 18, 18 ] PPI::Token::Operator ','
[ 1, 20, 20 ] PPI::Token::Symbol '$filename'
[ 1, 31, 31 ] PPI::Token::Structure ';'
variables() on the PPI::Statement::Variable returns just '$fh', which seems right to me, but it doesn't seem right that anything following '$fh' is part of the statement.
It also doesn't seem right that initializers for declared variables become part of the PPI::Statement::Variable:
ppidump 'my $x = 1;';
PPI::Document
PPI::Statement::Variable
[ 1, 1, 1 ] PPI::Token::Word 'my'
[ 1, 4, 4 ] PPI::Token::Symbol '$x'
[ 1, 7, 7 ] PPI::Token::Operator '='
[ 1, 9, 9 ] PPI::Token::Number '1'
[ 1, 10, 10 ] PPI::Token::Structure ';'
The absence of a facility like initializers() in PPI::Statement::Variables implies (to me) that including the initializer is not by design.
In playing around with Lexer.pm I found it pretty easy to have a variable declaration without parens stop after it sees the variable:
+ if ( $Statement->isa('PPI::Statement::Variable') ) {
+ my @schildren = $Statement->schildren();
+ if ( @schildren > 1 and !$schildren[1]->isa('PPI::Structure::List') ) {
+ return $self->_rollback( $Token );
+ }
+ }
but from the results it looks like that change is too naive:
ppidump 'open( my $fh, ">", $filename );'
PPI::Document
PPI::Statement
[ 1, 1, 1 ] PPI::Token::Word 'open'
PPI::Structure::List ( ... )
PPI::Statement::Variable
[ 1, 7, 7 ] PPI::Token::Word 'my'
[ 1, 10, 10 ] PPI::Token::Symbol '$fh'
PPI::Statement::Expression
[ 1, 13, 13 ] PPI::Token::Operator ','
[ 1, 15, 15 ] PPI::Token::Quote::Double '">"'
[ 1, 18, 18 ] PPI::Token::Operator ','
[ 1, 20, 20 ] PPI::Token::Symbol '$filename'
[ 1, 31, 31 ] PPI::Token::Structure ';'
I don't know enough about the lexing/parsing to know whether it's just a case of needing a little more logic at statement end/statement begin, whether it's a fundamental problem of a variable declaration being an expression, or what.
The issue of what to do after a statement-that-is-really-an-expression ends with more to parse seem similar to https://rt.cpan.org/Ticket/Display.html?id=36555