php-ast
Ideas for things to change/remove in the next major version (2.0)
- Mark AST_LIST as deprecated, and remove it whenever AST version 50 is removed.
Related to #94 - It was overlooked
AST_LIST is still used in https://github.com/phan/phan/blob/1.2.2/src/Phan/Analysis/PreOrderAnalysisVisitor.php#L613 (incorrectly, will fix)
- Change the reflection arginfo of ast\parse_code and ast\parse_file to make the int $version mandatory
- Remove legacy flags such as ast\flags\RETURNS_REF (alias of ast\flags\FUNC_RETURNS_REF, mentioned in README)
As these are functions and not methods, we can change the reflection info anytime.
Arginfo fixed by https://github.com/nikic/php-ast/commit/195036c7eec2ec47afde8b85c4b2caaa28271a0c.
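For readers following along, here is a minimal sketch of what calling the parser with an explicit version looks like. It assumes the php-ast extension is installed; the helper name parseWithExplicitVersion is hypothetical and only exists for this illustration. It picks the highest version reported by ast\get_supported_versions() rather than hard-coding a number.

```php
<?php
// Sketch only: requires the php-ast extension to do anything useful.
// parseWithExplicitVersion() is a hypothetical helper, not part of php-ast.
function parseWithExplicitVersion(string $code): ?ast\Node
{
    if (!extension_loaded('ast')) {
        return null; // php-ast is not installed; nothing to demonstrate
    }
    // Always pass the version explicitly instead of relying on a default.
    $version = max(ast\get_supported_versions());
    return ast\parse_code($code, $version);
}

$root = parseWithExplicitVersion('<?php $a = 1;');
```

When the extension is available, $root should be the top-level statement list node; otherwise the helper returns null.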
Does this version not support whitespace? Is that impossible (bytecode only), or is it due to some other reason?
All (?) modern code fixers work with tokenizers, but it's painful. A code fixer based on the AST or a higher-level abstraction model looks much better:
$this->summary('Make classes `final` by default.')
->apply(fn (ClassNode $c) => $c->abstract ?: $c->finalize());
PhpParser is kinda slow for this sort of task.
Any chance of keeping whitespace somehow?
In a word: No.
In a few more words: This extension only exposes what PHP provides. As PHP has no need for whitespace itself, the information is not preserved (and I expect it will never be).
For this purpose you need to use either PHP-Parser or tolerant-php-parser, both of which preserve full formatting information, though in different ways.
Okay. I have another more or less related question for this topic. Is it possible to store an ID for each closure definition in the bytecode? Just some bytes based on the current filename + parser position, for example. And provide a relatively fast way to get it at runtime (directly from the Closure object).
Closure identification is needed for systems that have a lot of closures with dynamic parameter lists and must cache each signature. There is no fast way to identify a cached closure (definition) besides the deprecated ReflectionFunction::export().
It would also solve another known issue: it's impossible to dump a closure's AST if there are 2 or more closures on the same line.
Or... just save the AST of the closure, like it works for assert?
Okay. I have another more or less related question for this topic. Is it possible to store an ID for each closure definition in the bytecode? Just some bytes based on the current filename + parser position, for example. And provide a relatively fast way to get it at runtime (directly from the Closure object).
This only parses the PHP code. It doesn't evaluate it, so no closure should/would be created in the resulting ast\Node.
php-ast already provides declId if you need to uniquely identify a closure's position within a file, and it'll be the same number every time parse_code is run, but that seems unrelated to what you want.
- That uniquely identifies two closures on the same line
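As a rough sketch of the point above, the following walks a parsed tree and collects the declaration id of every closure. It assumes the php-ast extension is installed and that the id is exposed as the __declId child of declaration nodes (that child name is my recollection of the README, not something stated in this thread). The helper name closureDeclIds is hypothetical.

```php
<?php
// Sketch only: requires the php-ast extension.
// Assumption: declaration nodes carry their id in children['__declId'].
function closureDeclIds(string $code): ?array
{
    if (!extension_loaded('ast')) {
        return null; // php-ast is not installed
    }
    $ids = [];
    $walk = function ($node) use (&$walk, &$ids): void {
        if (!$node instanceof ast\Node) {
            return; // scalar leaf or null child
        }
        if ($node->kind === ast\AST_CLOSURE && isset($node->children['__declId'])) {
            $ids[] = $node->children['__declId'];
        }
        foreach ($node->children as $child) {
            $walk($child);
        }
    };
    $walk(ast\parse_code($code, max(ast\get_supported_versions())));
    return $ids;
}

// Two closures declared on the same line.
$ids = closureDeclIds('<?php $a = function () {}; $b = function () {};');
```

If the assumption holds, $ids contains two distinct values even though both closures share a line. Note that this identifies nodes in the parsed tree only; it does not link a runtime Closure object back to a node.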
Also, the suggestions/questions here are things that can go in a minor version, so they should be different issues.
no closure
I mean AST, text, source code. Like it works for assert() (when it fails, it shows the first argument in the exception description).
That uniquely identifies two closures on the same line
It doesn't help to connect an instance of a Closure object with one of those nodes, no? Closure dumping is a very common task.
There's https://github.com/tpunt/php-ast-reverter , but it hasn't been updated in 3 years with new node kinds and I don't know the status of that project. You could try to create PRs to implement conversion of AST nodes to readable strings.
assert works on the raw C representation; the converted PHP representation returned by php would require more error handling, but looking at the C implementation
It doesn't help to connect an instance of a Closure object with one of those nodes, no? Closure dumping is a very common task.
There are no plans to do this, and it's not something I consider in scope. Additionally, files can be require()d multiple times (even after modifying those closures) and those closures would be different. E.g. these would have the same representation and the same line:
// line 2
$x = null; $c1 = fn() => $x; $x = 2; $c2 = fn() => $x;
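To make the same-line problem concrete, here is a small sketch using only core reflection. The closureKey helper is hypothetical; it builds an identifier from the file name and line range, which is the only location information ReflectionFunction exposes. Arrow functions need PHP 7.4 or later.

```php
<?php
// Hypothetical helper: identify a closure by file name and line range only.
function closureKey(Closure $c): string
{
    $r = new ReflectionFunction($c);
    return $r->getFileName() . ':' . $r->getStartLine() . '-' . $r->getEndLine();
}

// Both closures are declared on the same line, as in the example above.
$x = null; $c1 = fn() => $x; $x = 2; $c2 = fn() => $x;

// The keys collide even though the closures captured different values.
$sameKey = closureKey($c1) === closureKey($c2);
$results = [$c1(), $c2()];
```

Here $sameKey is true while the two closures return different values, so a line-based key cannot tell them apart.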
information is not preserved
@nikic btw, PHP preserves line numbers, so it's probably possible to preserve the offset as well (absolute or relative to the current line). In that case, it would be possible to restore all whitespace between any 2 nodes.
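The idea of recovering whitespace from offsets can be sketched with PHP's tokenizer, which does report byte offsets, unlike the AST that php-ast exposes. This is only an illustration of the principle at token level, assuming PHP 8.0 or later for PhpToken; the function name gapsBetweenTokens is hypothetical.

```php
<?php
// Sketch: recover the whitespace between consecutive non-whitespace tokens
// purely from each token's byte offset and length (PHP 8.0+ PhpToken).
function gapsBetweenTokens(string $code): array
{
    // Drop whitespace tokens so that only offsets carry the layout.
    $tokens = array_values(array_filter(
        PhpToken::tokenize($code),
        fn (PhpToken $t) => !$t->is(T_WHITESPACE)
    ));
    $gaps = [];
    for ($i = 1; $i < count($tokens); $i++) {
        $prev = $tokens[$i - 1];
        $end = $prev->pos + strlen($prev->text); // byte just past the previous token
        $gaps[] = substr($code, $end, $tokens[$i]->pos - $end);
    }
    return $gaps;
}

$gaps = gapsBetweenTokens("<?php \$a  =\n1;");
```

Each entry in $gaps is the exact source text between two neighbouring tokens, which is what an AST with per-node offsets would allow between any two nodes.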
btw, PHP preserves line numbers, so it's probably possible to preserve the offset as well (absolute or relative to the current line). In that case, it would be possible to restore all whitespace between any 2 nodes.
That's actually something I've wanted for a while. However, that would require changing php-src's lexer, which does not provide the column in the C ast node in older php versions (possibly the latest could be updated to add it, in 8.1). Changing the internals of php-src's lexer is something that may slightly affect performance of large php applications overall, so I was uncertain of whether other people who work on php-src would have interest in it.
- e.g. this could be used to add getColumn() to reflection. I think
Mentioned before in https://github.com/nikic/php-ast/issues/58#issuecomment-309767826
@nikic - https://bugs.php.net/bug.php?id=70024 mentioned that you needed to update bison to support this - Does your merged PR https://github.com/php/php-src/pull/3948 move us closer to accurate column numbers?
I agree, such an offset value is very desirable. And php-ast really needs to be a built-in part of PHP 8.
@TysonAndre We ended up reverting the location support, because it had some performance impact.
I agree, such an offset value is very desirable. And php-ast really needs to be a built-in part of PHP 8.
For the reasons mentioned in https://github.com/nikic/php-ast/issues/5#issuecomment-589973064 , it might be harder but not impossible to write tooling based on php-ast if php-ast was moved into core.
We ended up reverting the location support, because it had some performance impact.
That answers my question. What about the work being done on emitting improved syntax errors? Can that determine the column without performance impact (e.g. unexpected ')' on column X in php --syntax-check)?