psalm
psalm copied to clipboard
Improve performance by avoiding repeated scanning of files included/required only once
I'm not immediately sure how to effectively add a unit test for this. If existing unit tests pass, I'll be tempted to merge it if no one has any concerns after a week or so. This significantly speeds up scans and prevents memory issues on few older projects of ours with a ton of include and/or require calls which Psalm was incorrectly interpreting to be circular. This should resolve them by more accurately mimicking actual include_once() and require_once() behavior, instead of treating them more like include() and require(). Don't hesitate to speak up if anyone has any questions/concerns.
Do you have a before/after benchmark?
I already implemented this with a config key: https://github.com/vimeo/psalm/blob/6.x/src/Psalm/Internal/Analyzer/Statements/Expression/IncludeAnalyzer.php#L169
@staabm, is time command output before this change on our legacy modules/harmonist-hub project:
real 1m30.262s
user 4m53.527s
sys 0m3.208s
...and output after this PR:
real 0m44.634s
user 3m28.629s
sys 0m3.331s
@danog, I did not know about ignoreIncludeSideEffects. It is certainly in similar territory. After considering it a bit, I still think this PR makes sense as it more accurately reflects PHP's actual behavior for require() vs. require_once() calls. What do you think?
@danog, I might give this another week, then merge it if there is no more feedback.
The problem with making this mandatory is that in some cases, side effects are the only reason why a file is included.
Also, with scan order being non-deterministic in multithreaded mode, merging this means a Psalm issue might popup randomly in any of the files where another file is included with include_once/require_once (because we might have already scanned the include before in file A, thus we ignore side effects when including in file B; but file A and B may be inverted depending on the non-deterministic scheduling order); while with ignore_include_side_effects, we always disable side effect evaluation, regardless of whether we included the file before or not.
@danog, you are correct about the order of file loading not necessarily being the same as it would be at runtime. However, this non-deterministic behavior is comparable to composer class loading, which can happen in any number of different orders depending on when each class happens to be referenced at runtime. I think that is good evidence that this is generally a non-issue with PHP codebases.
Also, it seems like it would be a very unusual edge case where this would matter for require_once() or include_once(), since those necessarily only ever produce side effects after the first call. After that, the behavior produced by this PR is closer to what PHP is actually doing than ignore_include_side_effects. In our case I'm a little uncomfortable ignoring side effects from require() and include().
Decided to not merge this after all, as it introduces yet another element of non-determinism, without any strong reason in favor (performance issues can be solved using the newly added flag to ignore include side effects).
I will gladly merge this if it's implemented as another optional flag.
@danog, fair enough. I'll put this back on my TODO list and hopefully circle back to it before too long.
@danog, I've made this an optional flag as requested. What are your thoughts?
@danog, just checking in.
@danog, this has been open three weeks now. If no one is interested/able to review it, I think I will just merge it after another week since you said you'd gladly approve it if it was implemented as an optional flag.